jorainer / ensembldb

This is the ensembldb development repository.
https://jorainer.github.io/ensembldb

genomeToProtein not working on GRanges object with >1 range #67

Closed ibwoo closed 6 years ago

ibwoo commented 6 years ago

If I pass a GRanges object to genomeToProtein that has been subset to contain even just two ranges, the function completes without error but with warnings, and all of the start and end values in the result are -1, e.g. "Warning message: Transcript(s) 'rs12879019.ENST00000337425', 'rs12879019.ENST00000380365' ... could not be found "

If I run the function on a GRanges object subset to a single range, then the function works perfectly.
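Since single-range input maps correctly, one possible stopgap for affected versions is to map the ranges one at a time. The helper below is only a sketch: `map_one_by_one` is a hypothetical name, not part of ensembldb, and it assumes the mapping function is called as `FUN(x, db)`, which is how genomeToProtein is called.

```r
# Hypothetical helper (not part of ensembldb): apply a mapping function to
# each element of `x` separately and collect the results in a list.
# `FUN` is assumed to be callable as FUN(x, db).
map_one_by_one <- function(x, db, FUN) {
    res <- lapply(seq_along(x), function(i) FUN(x[i], db))
    names(res) <- names(x)
    res
}
```

With a GRanges object `gr` and an EnsDb object `edb` (both placeholder names), the call would be `map_one_by_one(gr, edb, genomeToProtein)`. This costs one database round trip per range, so it is likely to be slow for large inputs.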

I am also able to first perform genomeToTranscript, and then lapply on the resulting IRangesList with genomeToProtein.

I believe that within the genomeToProtein function, it merges the name of each IRanges object within the IRangesList (provided internally by genomeToTranscript) with the transcript ids. It's then clearly failing to find these merged names within the edb object.

jorainer commented 6 years ago

Thanks for reporting. The functions don't throw an error if one (or all) of the mappings fail; instead they return ranges with negative coordinates for the failed mappings.
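Because failures are signalled by negative coordinates rather than by an error, they have to be checked for explicitly. The following is a minimal sketch using only the IRanges package; the object `res` is a hand-built stand-in for a genomeToProtein result, and its names and coordinates are made up for illustration.

```r
library(IRanges)

# Made-up stand-in for a mapping result: one IRanges per input range,
# where a start of -1 marks a mapping that failed.
res <- IRangesList(
    rs1 = IRanges(start = c(12, -1), width = 1),
    rs2 = IRanges(start = -1, width = 1)
)

# Logical list flagging every individual mapping that failed.
failed <- start(res) < 0

# Names of the input ranges for which no mapping succeeded.
all_failed <- names(res)[all(failed)]

# Keep only the successful mappings; elements without any become empty.
ok <- res[!failed]
```

Checking `all_failed` after each call makes a silent failure like the one reported here visible immediately.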

Could you please provide the code that produces this error, so I can reproduce and fix it? Thanks!

jorainer commented 6 years ago

The problem should be fixed in version 2.3.5. Could you please verify, @ibwoo? You can install the updated version with devtools::install_github("jotsetung/ensembldb") or from BioC devel, but it could take several days until it's packaged there.

ibwoo commented 6 years ago

Thanks Johannes, I've finally had a chance to put my data through it and it works as expected. Great work!

jorainer commented 6 years ago

Good to hear that it's useful for you. I'm closing the issue. Feel free to re-open if needed.