KarchinLab / open-cravat

A modular annotation tool for genomic variants
MIT License

refseq annotation #88

Closed EugeneEA closed 6 months ago

EugeneEA commented 2 years ago

Hi, I was asked by my colleagues to add a RefSeq annotation to OC. I looked into it and was surprised not to find a built-in annotator for it, even though it seems like a useful piece of information (for literature searches, for example).

I can (and probably will) write it myself, but I want to bring the lack of such an annotator to your attention. It might work for at least some transcripts, namely those that belong to the 'MANE' set. (https://github.com/KarchinLab/open-cravat/issues/67)

Best, Eugene

rkimoakbioinformatics commented 2 years ago

Hi @EugeneEA, thanks for bringing attention to the issue. As you found, OC does not have a RefSeq mapper yet. The quickest fix would be to start with MANE transcripts. I have just published a mapper module "gencode" (v33.0.0), which adds a separate column for RefSeq transcripts and is otherwise identical to the latest hg38 mapper. It can be installed with oc module install gencode. To use this mapper module, run oc config system and note the conf_dir it reports. Open cravat.yml in that conf_dir and change genemapper from hg38 to gencode. Then use oc run as usual. The gencode module will be used instead of the hg38 mapper module, and the result will contain a RefSeq column.
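For anyone who would rather script the cravat.yml change than edit it by hand, here is a minimal sketch in Python. It assumes that genemapper appears as a simple top-level "genemapper: hg38" line in cravat.yml; the helper name set_genemapper and the sample text (including the other_setting line) are made up for illustration and are not part of OpenCRAVAT.

```python
import re


def set_genemapper(conf_text: str, mapper: str) -> str:
    """Return conf_text with the top-level `genemapper` value set to `mapper`.

    Assumes a plain `genemapper: <name>` line (hypothetical layout);
    any trailing comment on that line is preserved.
    """
    pattern = re.compile(r"^(genemapper[ \t]*:[ \t]*)\S+(.*)$", flags=re.MULTILINE)
    new_text, count = pattern.subn(
        lambda m: f"{m.group(1)}{mapper}{m.group(2)}", conf_text, count=1
    )
    if count == 0:
        # Refuse to guess where the key should go if it is missing.
        raise KeyError("no top-level 'genemapper' key found")
    return new_text


# Made-up sample content standing in for cravat.yml.
example = "genemapper: hg38\nother_setting: value\n"
updated = set_genemapper(example, "gencode")
print(updated)
```

To apply it to a real file, read cravat.yml from the conf_dir reported by oc config system, pass its text through the helper, and write the result back (keeping a backup copy first is a sensible precaution).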

Let me know if this works for your purpose (do you need RefSeq for all the Ensembl transcripts in "all mapping" column?). I am considering a dedicated RefSeq mapper in the future.

cheanney commented 2 years ago

Hi @rkimoakbioinformatics, I am trying to switch from Annovar to open-cravat for variant annotation. A dedicated RefSeq mapper would be greatly appreciated, since most of our users use RefSeq IDs.

Thanks,

Anney

cheanney commented 2 years ago

Hi @EugeneEA,

Do you have any plans to update Gencode to a newer version? Gencode for hg38 is currently at v40.

Thanks,

Anney

lmanchon commented 1 year ago

Hi,

If I change genemapper from hg38 to gencode in /usr/local/lib/python3.8/dist-packages/cravat/conf/cravat.yml, the run fails with:

Running reporter...
Text Reporter (textreporter)
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/cravat/cravat_class.py", line 1619, in run_reporter
    response_t = await reporter.run()
  File "/usr/local/lib/python3.8/dist-packages/cravat/cravat_report.py", line 560, in run
    await self.run_level(level)
  File "/usr/local/lib/python3.8/dist-packages/cravat/cravat_report.py", line 386, in run_level
    if len(colinfo_col["reportsub"]) > 0:
KeyError: 'reportsub'
Finished with an exception. Runtime: 4.026s
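The last frame indexes colinfo_col["reportsub"] directly, so any column-info entry that lacks that key raises KeyError. The sketch below only illustrates that failure mode with a made-up dictionary; the col_name value and the has_reportsub helper are hypothetical, and this is not the actual OpenCRAVAT code or a confirmed fix.

```python
def has_reportsub(colinfo_col: dict) -> bool:
    """Same length check as in the traceback, but tolerant of a missing key."""
    return len(colinfo_col.get("reportsub", {})) > 0


# Hypothetical column-info entry that has no "reportsub" key.
col_without = {"col_name": "example_column"}

try:
    # Direct indexing, as in the failing line of the traceback.
    len(col_without["reportsub"]) > 0
    failed = False
except KeyError:
    failed = True

print("direct indexing raised KeyError:", failed)
print("tolerant check result:", has_reportsub(col_without))
```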

kmoad commented 1 year ago

We are currently updating the mapper to use GENCODE release 43.

We are also exploring adding RefSeq annotation, but can't give a projected release date at this time.