prestodb / presto

The official home of the Presto distributed SQL query engine for big data
http://prestodb.io
Apache License 2.0

presto plugin developing problems #7586

Closed ffpeng90 closed 7 years ago

ffpeng90 commented 7 years ago

Hi all: I am trying to build a new Presto plugin in my own independent project (not inside the Presto root project). So I added some Maven dependencies to my pom and built my plugin.

    <dependency>
        <groupId>com.facebook.presto</groupId>
        <artifactId>presto-spi</artifactId>
        <scope>provided</scope>
    </dependency>

    <dependency>
        <groupId>io.airlift</groupId>
        <artifactId>slice</artifactId>
        <scope>provided</scope>
    </dependency>

My first question: how can I debug my plugin code without deploying the plugin into the "plugins" directory on a Presto cluster? Can I start a local Presto server in my IDE to debug my code?

My second question: I am building a Presto plugin to read data from a new file format, "carbondata", in which most columns are encoded with a global dictionary.
Right now I am using the RecordSet interface to return fully decoded records. However, in some cases, such as some aggregation jobs, the decoding step is not needed.
So is there any optimization in Presto that can delay the decoding process?

kokosing commented 7 years ago

Ad. 1. You can do:

Ad. 2. I haven't heard of anything like that. However, you will get the list of columns (the projection) that the query is going to use, so you don't need to decode values for columns that are not used.
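To make the projection point concrete, here is a minimal sketch in plain Java. It does not use the Presto SPI: `ProjectionSketch`, its fields, and `read` are hypothetical names invented for illustration, and the "file" is just an in-memory array of dictionary ids. It only shows the idea that a reader which is handed the projected column indexes can skip dictionary lookups for every other column.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a dictionary-encoded file reader; not a Presto class.
final class ProjectionSketch {
    // dictionaries[c][id] is the decoded value for dictionary id `id` in column c.
    private final String[][] dictionaries;
    // rows[r][c] is the dictionary id stored for column c in row r.
    private final int[][] rows;
    // Number of dictionary lookups performed per column, so the saving is observable.
    final int[] decodeCounts;

    ProjectionSketch(String[][] dictionaries, int[][] rows) {
        this.dictionaries = dictionaries;
        this.rows = rows;
        this.decodeCounts = new int[dictionaries.length];
    }

    // Decode only the requested columns, in the requested order.
    List<String[]> read(int[] projectedColumns) {
        List<String[]> out = new ArrayList<>();
        for (int[] row : rows) {
            String[] decoded = new String[projectedColumns.length];
            for (int i = 0; i < projectedColumns.length; i++) {
                int c = projectedColumns[i];
                decoded[i] = dictionaries[c][row[c]];
                decodeCounts[c]++;
            }
            out.add(decoded);
        }
        return out;
    }

    public static void main(String[] args) {
        ProjectionSketch reader = new ProjectionSketch(
                new String[][] {{"a", "b"}, {"x", "y"}},
                new int[][] {{0, 1}, {1, 0}, {0, 0}});
        // Only column 0 is projected, so column 1 is never decoded.
        for (String[] row : reader.read(new int[] {0})) {
            System.out.println(row[0]);
        }
        System.out.println("lookups on column 1: " + reader.decodeCounts[1]);
    }
}
```

In a real connector the projected columns would come from whatever column handles the engine passes in; how that maps onto this sketch is an assumption, not something the thread spells out.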

ffpeng90 commented 7 years ago

Ad. 1 seems great!!! Thanks for your suggestion.

For Ad. 2, here is a usage scenario: select col1, count(*) from tableA group by col1; In this case, Presto has to decode all the data in this column and then push it into the aggregation function. The decoding takes a lot of time, which makes the query slow. Is there any optimization for this case?

kokosing commented 7 years ago

Ad. 2. In your case you only need to decode col1. count(*) does not need to read your data; it just counts the rows.
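For readers wondering what "delaying the decode" would mean for the group-by query in the question, the sketch below shows the general idea in plain Java: count rows per dictionary id first, then look up each distinct id in the dictionary once at the end. This is an illustration of the concept only. Nothing in this thread says Presto does this for a RecordSet-based connector, and `DictionaryGroupBySketch` and `countByKey` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of "aggregate on dictionary ids, decode last"; not a Presto API.
final class DictionaryGroupBySketch {
    // Equivalent in spirit to: SELECT col1, count(*) ... GROUP BY col1
    // encodedColumn holds one dictionary id per row; dictionary[id] is the decoded value.
    static Map<String, Long> countByKey(int[] encodedColumn, String[] dictionary) {
        long[] counts = new long[dictionary.length];
        for (int id : encodedColumn) {
            counts[id]++; // no dictionary lookup per row
        }
        Map<String, Long> result = new LinkedHashMap<>();
        for (int id = 0; id < counts.length; id++) {
            if (counts[id] > 0) {
                result.put(dictionary[id], counts[id]); // one lookup per distinct key
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] dictionary = {"red", "green", "blue"};
        int[] col1 = {2, 0, 2, 2};
        System.out.println(countByKey(col1, dictionary));
    }
}
```

With this shape the number of dictionary lookups depends on the number of distinct keys rather than the number of rows, which is the saving the question is after.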

dain commented 7 years ago

You should also delay decoding columns until Presto asks for a column. For example, a common query is:

SELECT * FROM table t WHERE someVeryRareCondition(t.x)

If someVeryRareCondition never returns true, then Presto will only ask for data from column x.

If your data source is column oriented, then you will want to use the PageSource API, which is more efficient for Presto. For lazy decoding in PageSource, we use LazyBlock.

-dain
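The lazy-decoding idea described above can be sketched without any Presto dependency. The code below is not Presto's LazyBlock and does not use the PageSource API; `LazyDecodeSketch`, `LazyColumn`, and `select` are hypothetical names. It only demonstrates the behaviour described: a column is decoded the first time something asks for it, so a filter that never matches never triggers the decode.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;
import java.util.function.Supplier;

// Hypothetical illustration of decode-on-first-access; not Presto's LazyBlock.
final class LazyDecodeSketch {
    static final class LazyColumn {
        private final Supplier<String[]> decoder;
        private String[] values; // stays null until the column is first requested

        LazyColumn(Supplier<String[]> decoder) {
            this.decoder = decoder;
        }

        String[] get() {
            if (values == null) {
                values = decoder.get(); // decode once, on demand
            }
            return values;
        }

        boolean isLoaded() {
            return values != null;
        }
    }

    // Roughly: SELECT y FROM t WHERE predicate(x). Column y is touched only for matching rows.
    static List<String> select(int[] x, LazyColumn y, IntPredicate predicate) {
        List<String> out = new ArrayList<>();
        for (int row = 0; row < x.length; row++) {
            if (predicate.test(x[row])) {
                out.add(y.get()[row]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        LazyColumn y = new LazyColumn(() -> new String[] {"a", "b", "c"});
        // A predicate that never matches: y is never decoded.
        System.out.println(select(new int[] {1, 2, 3}, y, v -> v > 100));
        System.out.println("decoded: " + y.isLoaded());
    }
}
```

The single-threaded null check is a simplification for the sketch; a real implementation would need to consider concurrent access.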



ffpeng90 commented 7 years ago

I am trying this suggestion. Thanks, kokosing and dain.

ffpeng90 commented 7 years ago

Hi @kokosing, I brought in the pom of presto-rest-slack and tried to run a test case, but I hit the problem below. Have you seen it before?

    200) Error injecting constructor, java.lang.IncompatibleClassChangeError: org/objectweb/asm/tree/ClassNode
      at com.facebook.presto.metadata.MetadataManager.<init>(MetadataManager.java:139)
      at com.facebook.presto.server.ServerMainModule.setup(ServerMainModule.java:314)
      while locating com.facebook.presto.metadata.MetadataManager
      at com.facebook.presto.server.ServerMainModule.setup(ServerMainModule.java:315)
      while locating com.facebook.presto.metadata.Metadata

My console also prints this warning continuously: 2017-03-16 11:51:42 WARNING Error fetching node state from http://127.0.0.1:37526/v1/info/state: Server refused connection: http://127.0.0.1:37526/v1/info/state

kokosing commented 7 years ago

No, I have not seen this. Make sure you have recompiled everything, and try to use the same dependency versions as Presto. It seems that some dependencies are incompatible.

electrum commented 7 years ago

Can you post your POM in a Gist? https://gist.github.com/

ffpeng90 commented 7 years ago

I have posted my pom here: https://gist.github.com/ffpeng90/546052ecda4e9a2044f875db682aab34

losipiuk commented 7 years ago

@ffpeng90 I suggest you close this issue (it is a question rather than an issue) and move to https://groups.google.com/forum/#!forum/presto-users. That is a more appropriate place for a discussion like this.

ffpeng90 commented 7 years ago

OK, I'm going to do this.

