DEIB-GECO / GMQL-WEB

GMQL WEB Interface
http://www.bioinformatics.deib.polimi.it/geco/?home
Apache License 2.0

UCSC Genome Browser connection #43

Closed. marcomass closed this issue 6 years ago

marcomass commented 7 years ago

Fix issues in visualizing datasets in the UCSC Genome Browser by adding a unique ID-value pair in each "group" attribute (more testing of this aspect should be done).

marcomass commented 7 years ago

@acanakoglu I checked with a single-sample dataset and it worked, but with a multi-sample dataset it did not. Furthermore, you mentioned that datasets in GTF format are not handled well; please check.

For testing (a multiple-sample dataset in GDM format), I used my dataset, which is the result of the following query:

TEAD4_rep_broad_all = SELECT(project == "ENCODE" AND assembly == "hg19" AND assay == "ChIP-seq" AND output_type == "peaks" AND experiment_target == "TEAD4-human") HG19_ENCODE_BROAD_AUG_2017;
MATERIALIZE TEAD4_rep_broad_all into TEAD4_rep_broad_all;

TEAD4_rep_broad_Ishi = SELECT(project == "ENCODE" AND assembly == "hg19" AND assay == "ChIP-seq" AND output_type == "peaks" AND experiment_target == "TEAD4-human" AND biosample_term_name == "Ishikawa") HG19_ENCODE_BROAD_AUG_2017;
MATERIALIZE TEAD4_rep_broad_Ishi into TEAD4_rep_broad_Ishi;

TEAD4_rep_broad_NOIshi = SELECT(project == "ENCODE" AND assembly == "hg19" AND assay == "ChIP-seq" AND output_type == "peaks" AND experiment_target == "TEAD4-human"; semijoin: biosample_term_name NOT IN TEAD4_rep_broad_Ishi) HG19_ENCODE_BROAD_MAY_2017;
MATERIALIZE TEAD4_rep_broad_NOIshi into TEAD4_rep_broad_NOIshi;

acanakoglu commented 6 years ago

I checked, and it is working. If it times out, try opening the Genome Browser.

marcomass commented 6 years ago

@acanakoglu

Please use the following query to test the issue we were talking about, regarding job_test_join_issue53_guest_new671_20171002_161653_PROM_TSS:

PROM = SELECT(annotation_type == "promoter") HG19_BED_ANNOTATION;
TSS = SELECT(annotation_type == "TSS") HG19_BED_ANNOTATION;
PROM_TSS = JOIN(DL(0); output: LEFT) PROM TSS;
MATERIALIZE PROM_TSS INTO PROM_TSS;
TSS_PROM = JOIN(DL(0); output: RIGHT) PROM TSS;
MATERIALIZE TSS_PROM INTO TSS_PROM;

PROM_TSSd = JOIN(DL(0); output: LEFT_DISTINCT) PROM TSS;
MATERIALIZE PROM_TSSd INTO PROM_TSSd;
TSS_PROMd = JOIN(DL(0); output: RIGHT_DISTINCT) PROM TSS;
MATERIALIZE TSS_PROMd INTO TSS_PROMd;

PROM_TSSboth = JOIN(DL(0); output: BOTH) PROM TSS;
MATERIALIZE PROM_TSSboth INTO PROM_TSSboth;

marcomass commented 6 years ago

There is still an issue when a dataset has no score attribute in its schema, probably due to the default value that you set for the score attribute (it should be numeric, not ".").

Please use the following query for testing:

DATA_SET_VAR = SELECT(biospecimen_aliquot__bcr_sample_barcode == "tcga-01-0642-11a") HG19_TCGA_cnv;
MATERIALIZE DATA_SET_VAR INTO RESULT_DS;

and see the error message:

Expecting number field 5 line 1 of http://genomic.elet.polimi.it/gmql-rest-test/datasets/test_ucsc_20171128_184729_RESULT_DS/S_00000/region?authToken=419d1f99-71f8-4503-9e80-73a9c8969483&bed6=true , got .
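To illustrate the kind of fix this error points to: in BED, field 5 is the score, which UCSC expects to be an integer between 0 and 1000. The sketch below shows one possible way to write a BED6 line with a numeric fallback when a region has no score. The function name `to_bed6` and the default score of 0 are assumptions made for illustration; this is not the GMQL-WEB implementation.

```python
def to_bed6(chrom, start, end, name=None, score=None, strand=None):
    """Format one region as a tab-separated BED6 line.

    Hypothetical helper: a missing or non-numeric score falls back to 0
    (an assumed default) instead of ".", which UCSC rejects in field 5.
    """
    try:
        score_value = int(float(score))
    except (TypeError, ValueError, OverflowError):
        # score is None, ".", NaN, infinite, or otherwise not a number
        score_value = 0
    # BED scores are integers in the range 0..1000
    score_value = max(0, min(1000, score_value))

    # "." is a valid placeholder for name and strand, but not for score
    if strand not in ("+", "-"):
        strand = "."
    if not name:
        name = "."

    fields = [chrom, str(int(start)), str(int(end)), name, str(score_value), strand]
    return "\t".join(fields)


if __name__ == "__main__":
    # A region with no score attribute in its schema
    print(to_bed6("chr1", 100, 200))
    # A region whose score was exported as "."
    print(to_bed6("chr1", 100, 200, name="seg1", score=".", strand="+"))
```

Whether 0 is the right default depends on how the track should be displayed in the browser; the point is only that field 5 must parse as a number.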