Closed Mkddb closed 6 months ago
Hello Mkd,
Thank you so much for this report!! Indeed, the morbid gene annotations in $ANNOTSV/share/AnnotSV/AnnotationsHuman/FtIncludedInSV/PathogenicSV/GRCh38/pathogenic*_SV_GRCh38.sorted.bed are wrong... Unfortunately, I'm out of my new lab until September. I will do my best to push a new version with correct GRCh38 morbid gene coordinates. Really sorry. I will get back to you ASAP.
Véronique
Hello Veronique,
It would be really helpful to have a fix in the new version as soon as possible. Meanwhile, can you suggest an alternative way to check for such wrong entries for pathogenic SVs, since I have hundreds of variants to screen for now? Or do I have to check each entry manually?
Looking forward to the fix.
Thanks again, Mkd
Hi Mkd, I couldn't leave such a bug, so I just added a fix. You need to update AnnotSV to v3.1.2. I'm really sorry for the inconvenience. Let me know if everything is working now (it should be). Best, Véronique
Hello Veronique,
Thanks for the fix and update. Everything else looks fine now, except for one discrepancy.
Sorry to disturb you again. There's a discrepancy for one variant: chr19:39727656-39735580, a deletion in hg38 coordinates (Job ID: AnnotSV_AQMOqJLGYB in hg38). There's no overlapping P_loss_source for it, and the affected gene is CLC (CHARCOT-LEYDEN CRYSTAL PROTEIN), but the OMIM annotation in AnnotSV appears as Cold-induced sweating syndrome 2, 610313 (3) AR, which is linked to a different gene, CLCF1 (CARDIOTROPHIN-LIKE CYTOKINE FACTOR 1).
Could you please check this and find a fix for it?
With kind regards, Mkd
Hi Mkd,
Great if the fix works! Thanks for the feedback. Regarding your discrepancy, I don't think it's a bug. If you look at the OMIM ID 607672, CLC and CLCF1 are alternative gene symbols. So everything seems fine to me.
Best, Véronique
That's a bit strange. Apart from the full gene names, the coordinates for CLC (OMIM: 153310, 19:39,731,255-39,738,029) and for CLCF1 (OMIM: 607672, 11:67,364,168-67,374,177) are also very different. Even on different chromosomes.
Hi Véronique,
Sorry to dig up this old thread, but I recently encountered the same issue.
I have a variant on hg38 that was annotated with OMIM information from hg19. The variant in question is chr17:3284806-3411401 INV and the annotations are P_loss_phen and P_loss_source morbid:ASPA. If you look at the genomic locus on hg38, it does not contain the ASPA gene. However, in hg19 this gene is located in the locus.
grep -i canavan Annotations_Human/FtIncludedInSV/PathogenicSV/GRCh38/pathogenic_Loss_SV_GRCh38.sorted.bed
grep -i canavan Annotations_Human/FtIncludedInSV/PathogenicSV/GRCh37/pathogenic_Loss_SV_GRCh37.sorted.bed
Both return the same coordinates:
17 3379290 3406699 Canavan disease, 271900 (3) AR morbid:ASPA 17:3379290-3406699
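To screen a whole file rather than one disease at a time, the two greps above can be generalised. The following is only a rough sketch: `shared_coords` is a hypothetical helper, not part of AnnotSV, and it assumes the BED files are tab-separated with chromosome, start and end in the first three columns. It prints every record of the second file whose chromosome, start and end also occur unchanged in the first file. A hit is only a candidate for an entry that was not lifted over, since a region can in principle have the same coordinates in both builds.

```shell
# Hypothetical helper, not an AnnotSV command.
# Usage: shared_coords GRCh37.bed GRCh38.bed
# Prints the lines of the second file whose (chrom, start, end) triple
# also appears in the first file. Assumes tab-separated BED input.
shared_coords() {
  awk -F'\t' '
    NR == FNR { seen[$1 FS $2 FS $3] = 1; next }  # first file: remember each triple
    ($1 FS $2 FS $3) in seen                      # second file: print matching records
  ' "$1" "$2"
}

# Example call (paths as used in the grep commands above):
# shared_coords \
#   Annotations_Human/FtIncludedInSV/PathogenicSV/GRCh37/pathogenic_Loss_SV_GRCh37.sorted.bed \
#   Annotations_Human/FtIncludedInSV/PathogenicSV/GRCh38/pathogenic_Loss_SV_GRCh38.sorted.bed
```

With the files as described here, the ASPA record would be expected among the hits, because both builds list it at 17:3379290-3406699.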
Would really appreciate it if you could help fix the annotation sources.
Thank you, Tejas
Hi Tejas,
Thanks for reporting this with the specific “ASPA” example. New annotations are expected to be released in January. I will keep this bug in mind and check it against the new update.
Sorry for the delay, I'm chasing time...
Best, Véronique
I have added a fix for misleading OMIM annotation in the dev branch. This will be distributed soon with the next annotation release.
@Mkddb
chr19:39727656-39735580
=> Ok, no more bad OMIM_ID annotation (607672 no longer reported)
@tejas-j Still in progress (this one is not a misleading OMIM annotation)
Hi Véronique,
That sounds great. Thanks for the fix for the misleading OMIM annotation. Looking forward to the next annotation release.
Best regards, MKd
@tejas-j
Variant on hg38 annotated with OMIM information from hg19:
grep -i canavan Annotations_Human/FtIncludedInSV/PathogenicSV/GRCh38/pathogenic_Loss_SV_GRCh38.sorted.bed
grep -i canavan Annotations_Human/FtIncludedInSV/PathogenicSV/GRCh37/pathogenic_Loss_SV_GRCh37.sorted.bed
Both return the same coordinates:
17 3379290 3406699 Canavan disease, 271900 (3) AR morbid:ASPA 17:3379290-3406699
I have added a fix for this bug in the dev branch. This will be distributed soon with the next annotation release. Thank you very much for the report and sorry for the long delay.
AnnotSV 3.4 is posted. Let me know if everything works well on your side.
Hello Veronique,
Hope you are doing great.
In my annotations, I have noticed that certain P_loss_coordinates come from Hg19 reference coordinates, even though I ran the annotation in Hg38 mode. For example, I annotated this deletion, chr11_63361790_63445978 (DEL), in Hg38 mode (Job ID: AnnotSV_z2NapwjvZt in hg38). In the "pathogenic_SV" column, P_loss_source is morbid ATL3 and the coordinates are 11:63391558-63439446 (which are on the Hg19 reference). However, the Hg38 coordinates for the ATL3 gene are chr11:63,624,087-63,671,974, which lie outside my deletion, so this gene should probably not be reported as P_loss_source.
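The comparison described here can be reproduced with a small interval check. This is only a sketch: `overlaps` is a hypothetical helper, not an AnnotSV command, and it treats both intervals as closed (1-based, inclusive), which is an assumption about how the coordinates are meant to be read.

```shell
# Hypothetical helper, not part of AnnotSV.
# Usage: overlaps CHROM_A START_A END_A CHROM_B START_B END_B
# Exit status 0 when both intervals are on the same chromosome and share
# at least one position. An optional "chr" prefix is ignored.
overlaps() {
  [ "${1#chr}" = "${4#chr}" ] && [ "$2" -le "$6" ] && [ "$5" -le "$3" ]
}

# Deletion from this report versus the reported P_loss coordinates (Hg19 position of ATL3)
overlaps chr11 63361790 63445978 11 63391558 63439446 \
  && echo "deletion overlaps the reported P_loss coordinates"

# Same deletion versus the Hg38 position of ATL3
overlaps chr11 63361790 63445978 chr11 63624087 63671974 \
  || echo "deletion does not overlap ATL3 at its Hg38 position"
```

The first test succeeds and the second fails, which is consistent with the match having been made against Hg19 coordinates.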
Could you please check this and help me understand it?
Thanks in advance, Mkd