DEIB-GECO / GMQL

GMQL - GenoMetric Query Language
http://www.bioinformatics.deib.polimi.it/geco/
Apache License 2.0

problems on JOIN (genometric predicate) #74

Closed sunbrn closed 6 years ago

sunbrn commented 7 years ago

I am getting wrong results with this query:

DATA_SET_VAR = SELECT() A;
DATA_SET_VAR2 = SELECT() B;
RES = JOIN(DISTANCE<0;output:LEFT_DISTINCT) DATA_SET_VAR DATA_SET_VAR2;
MATERIALIZE RES INTO C;

The results are wrong regardless of the output option.

(image) Black is DATA_SET_VAR, brown is DATA_SET_VAR2, blue is RES.

Why are the two input regions on the right not included in the result?

@andreagulino

akaitoua commented 7 years ago

@andreagulino, this is a containing region. Did the last fix affect how containing regions are handled? Please check.

andreagulino commented 7 years ago

@sunbrn, do you have the samples in the figure?

sunbrn commented 7 years ago

job_join_issue74_bernasconi_20171011_230336_mydsaJOIN_LEFTDIST.zip
job_join_issue74_bernasconi_20171011_230336_mydsaJOIN_LEFTDIST_INPUT.zip
job_join_issue74_bernasconi_20171011_230336_mydsaJOIN_LEFTDIST_INPUT2.zip

Yes, I joined the ones whose names end in INPUT and INPUT2 with the statement: RES = JOIN(DISTANCE<0;output:LEFT_DISTINCT) DATA_SET_VAR DATA_SET_VAR2;

The .zip whose name ends in LEFTDIST contains the output.

sunbrn commented 7 years ago

@andreagulino What is the current status? Please let me know when it is fixed.

andreagulino commented 7 years ago

@akaitoua I haven't made any changes to the Join implementation so far.

akaitoua commented 6 years ago

The regions are distinct in each sample. The duplication shown in the figure above appears because the result is made up of two files. I fixed the problem with the containing region. You can now use a negative distance to represent a containing region.
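To illustrate why a containing region should satisfy DISTANCE<0, here is a minimal Python sketch. It is not the GMQL engine's code. The `Region` type, the half-open coordinates, the distance formula, and the `join_left_distinct` helper are all assumptions made for illustration, and the helper only approximates what LEFT_DISTINCT is expected to return.

```python
from typing import NamedTuple, Optional


class Region(NamedTuple):
    chrom: str
    start: int  # inclusive (assumed convention)
    stop: int   # exclusive (assumed convention)


def distance(a: Region, b: Region) -> Optional[int]:
    """Gap in bases between the closest ends of a and b.

    Negative when the regions overlap, which includes the case where
    one region fully contains the other. None across chromosomes.
    """
    if a.chrom != b.chrom:
        return None
    return max(a.start, b.start) - min(a.stop, b.stop)


def join_left_distinct(left, right):
    """Keep each left region once if any right region is at distance < 0."""
    result = []
    for l in left:
        for r in right:
            d = distance(l, r)
            if d is not None and d < 0:
                if l not in result:
                    result.append(l)
                break  # one match is enough for a distinct output
    return result


# The first left region contains the first right region.
left = [Region("chr1", 0, 100), Region("chr1", 200, 300)]
right = [Region("chr1", 20, 30), Region("chr1", 400, 500)]
kept = join_left_distinct(left, right)
```

Under these assumptions, the containing pair has distance 20 - 30 = -10, so the containing left region is kept, while the left region with no overlapping partner is dropped.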