thelovelab / tximeta

Transcript quantification import with automatic metadata detection
https://thelovelab.github.io/tximeta/

tximeta couldn't find matching transcriptome, returning non-ranged SummarizedExperiment #37

Closed AdeleMangelinck closed 4 years ago

AdeleMangelinck commented 4 years ago

Good afternoon,

I tried to import Salmon (v0.14.1) data via tximeta but got the error message: "couldn't find matching transcriptome, returning non-ranged SummarizedExperiment"

For info: I created the decoy-containing index as recommended in: https://combine-lab.github.io/alevin-tutorial/2019/selective-alignment/ based on GRCh37.primary_assembly.genome.fa.gz and gencode.v30.transcripts.fa.gz

meta_info.json file is: { "salmon_version": "0.14.1", "samp_type": "none", "opt_type": "vb", "quant_errors": [], "num_libraries": 1, "library_types": [ "ISR" ], "frag_dist_length": 1001, "seq_bias_correct": false, "gc_bias_correct": false, "num_bias_bins": 4096, "mapping_type": "mapping", "num_valid_targets": 207749, "num_decoy_targets": 25, "num_eq_classes": 482041, "serialized_eq_classes": false, "eq_class_properties": [ "range_factorized" ], "length_classes": [ 522, 674, 1082, 2354, 191154276 ], "index_seq_hash": "b838e8d90e0717e0c535b76426265fed98681b179190b88415bf8461ee768748", "index_name_hash": "74f2b5c5db8bd73b8879c17fc16046a17820f6e756df124e007b63696cb2ce73", "index_seq_hash512": "74712e797260e64026ef6546af70d64fb0d19b8592cbdaf4ae8e2ae8df3f17ac4fae5ce56f01f100f79a8a8d94ffa9a3902d9ac29874775a5c62b1ef44cc0bc6", "index_name_hash512": "4075d849c6b761a9b5753f2aaf213c0dfcfdae3e94139565ea42a719fb6316d71ce99a01709ac4c0cacf56a54d3cd149e04d21a64a136f645967e18fe00559ab", "num_bootstraps": 0, "num_processed": 40772522, "num_mapped": 32227711, "num_decoy_fragments": 5397177, "num_dovetail_fragments": 160045, "num_fragments_filtered_vm": 5430861, "num_alignments_below_threshold_for_mapped_fragments_vm": 33439397, "percent_mapped": 79.04272146814955, "call": "quant", "start_time": "Mon Jun 22 08:33:14 2020", "end_time": "Mon Jun 22 11:39:25 2020" }

R version is 3.6.3; tximeta version is 1.4.5.

Thank you in advance for your help.

Adèle

mikelove commented 4 years ago

You are using an older version of tximeta (and Bioconductor), one that predates the Gencode release you used.

Out-of-date Bioconductor packages cannot be modified (e.g. I can't change the hash table of last year's version of tximeta to include new Gencode release hashes).

See other similar threads:

https://github.com/mikelove/tximeta/issues/35
https://github.com/mikelove/tximeta/issues/34
https://github.com/mikelove/tximeta/issues/33
...

AdeleMangelinck commented 4 years ago

Thanks, but as I mentioned, I am using the latest version of tximeta (1.4.5) and an old version of the Gencode annotation (GRCh37.primary_assembly.genome.fa.gz and gencode.v30.transcripts.fa.gz). So your suggestion does not seem to explain the problem.

mikelove commented 4 years ago

tximeta is 1.6 now, but you are right that Gencode v30 is old and it should have been recognized by 1.4 as well. Sorry I misdiagnosed the issue.

I think this actually relates to another issue: Salmon's treatment of decoys during a brief period in 2019.

I believe Salmon/alevin v0.14.1 changed the index_seq_hash when decoys were included, and that this is fixed in the current release of Salmon. Can you use the latest Salmon/alevin? (This is recommended anyway, as it has performance improvements over 0.14.)
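
To check whether a given quantification directory may be affected, it can help to pull the relevant fields out of meta_info.json before importing. The following is a minimal sketch in Python, not part of tximeta or Salmon: the helper names `summarize_meta_info` and `uses_decoys` are hypothetical, and the field names are taken from the meta_info.json quoted in this thread. It only reports values; it does not reproduce tximeta's matching.

```python
import json

# Fields relevant to the diagnosis discussed above. The names come from the
# meta_info.json quoted in this thread; other Salmon versions may differ.
FIELDS = ("salmon_version", "index_seq_hash", "num_decoy_targets")


def summarize_meta_info(text):
    """Parse the text of a meta_info.json file and return the diagnostic fields.

    Hypothetical helper for illustration only. Missing keys are reported as None.
    """
    meta = json.loads(text)
    return {name: meta.get(name) for name in FIELDS}


def uses_decoys(summary):
    """Return True if the summary reports at least one decoy target."""
    return bool(summary.get("num_decoy_targets"))


# Abbreviated example built from the values posted earlier in this thread.
example = (
    '{"salmon_version": "0.14.1", "num_decoy_targets": 25, '
    '"index_seq_hash": '
    '"b838e8d90e0717e0c535b76426265fed98681b179190b88415bf8461ee768748"}'
)
summary = summarize_meta_info(example)
print(summary)
print("decoys in index:", uses_decoys(summary))
```

In practice the text would be read from the `aux_info/meta_info.json` file inside a Salmon output directory. A Salmon version of 0.14.x combined with a non-zero decoy count would match the situation described in this comment.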

AdeleMangelinck commented 4 years ago

Thanks, I will try with a more recent version of salmon.

AdeleMangelinck commented 4 years ago

It worked! Thanks very much!

lulumagic7 commented 1 year ago

Hi Mike,

I tried to import Salmon (1.7.0) data via tximeta but got the error message: "couldn't find matching transcriptome, returning non-ranged SummarizedExperiment"

For info: The decoy-containing index was created as recommended in: https://combine-lab.github.io/alevin-tutorial/2019/selective-alignment/ based on GRCh38.primary_assembly.genome.fa.gz and gencode.v42.transcripts.fa.gz

meta_info.json file is: { "salmon_version": "1.7.0", "samp_type": "none", "opt_type": "vb", "quant_errors": [], "num_libraries": 1, "library_types": [ "ISR" ], "frag_dist_length": 1001, "frag_length_mean": 206.4325264121549, "frag_length_sd": 47.40314406386443, "seq_bias_correct": false, "gc_bias_correct": true, "num_bias_bins": 4096, "mapping_type": "mapping", "keep_duplicates": false, "num_valid_targets": 251550, "num_decoy_targets": 194, "num_eq_classes": 419171, "serialized_eq_classes": false, "eq_class_properties": [ "range_factorized", "gzipped" ], "length_classes": [ 549, 772, 1395, 2643, 347561 ], "index_seq_hash": "adf85ccb8ecd0dd263461e763872a802ae20f0258a96e4dc95506c36de54d9a7", "index_name_hash": "f0dc84141a658e75c3a77ce593b599b401b98f79c18efbd62484901f7d62bb3f", "index_seq_hash512": "aed780be81faf3d7e60a111ccdb6c5d9210efc13245431849dd69b898c969b43b48e08b403bbb28298959c0dea42987eb2f701ae472b8b796b748be496534abb", "index_name_hash512": "e860b9f24f60a391d94422283c9d156dfde95d972e31ee7a56c2b1fcf5e115af353675305610e7037708def8b73afe7c5ac6f47cfff2d32f64441fc54e8481aa", "index_decoy_seq_hash": "b87b7a94564c31d78da407e60a0aeb310b2dbdb398e0ddb39392286aaf2fe88c", "index_decoy_name_hash": "5fc84a462ccf4735efdc48604a520ee47e1319156c2a5c252f5daba1e99bb401", "num_bootstraps": 0, "num_processed": 27690073, "num_mapped": 19586701, "num_decoy_fragments": 1084602, "num_dovetail_fragments": 2584499, "num_fragments_filtered_vm": 2330601, "num_alignments_below_threshold_for_mapped_fragments_vm": 18011090, "percent_mapped": 70.73546176638827, "call": "quant", "start_time": "Mon Nov 28 17:35:37 2022", "end_time": "Mon Nov 28 17:41:16 2022" }

R version is 4.2.1; tximeta version is 1.14.1.

Thanks so much in advance for your help!

Lu

mikelove commented 1 year ago

See here:

https://github.com/mikelove/tximeta/issues/69

lulumagic7 commented 1 year ago

Thanks so much! It works!