fakedrtom / SVAFotate

MIT License
39 stars 2 forks source link

liftover to hs1 genome #14

Closed yusmiatiliau closed 1 year ago

yusmiatiliau commented 1 year ago

Hi SVAFotate author,

Thank you for writing this very useful tool to annotate SV. I have SV calls from cuteSV, and mapped to hs1 genome that I would like to annotate. So, I took your hg38 annotation file and tried to liftover it to hs1. However, SVAFotate doesn't seem to like it and returned error. I wonder if you could share the command you use for liftover, so I can reproduce them?

I ran the liftover like this:

./liftover \
SVAFotate_core_SV_popAFs.GRCh38.bed \
hg38-chm13v2.over.chain.gz \
SVAFotate_core_SV_popAFs.lift_to_hs1.bed \
SVAFotate_core_SV_popAFs.GRCh38_unmapped.bed \
-bedPlus=3

This returned no header and then I use awk to add header.

when I ran SVAfotate, it came back with this error:

Traceback (most recent call last):
  File "/nesi/nobackup/uoo03533/Cen/python_userbase/SVAfotate/bin/svafotate", line 8, in <module>
    sys.exit(main())
  File "/nesi/nobackup/uoo03533/Cen/python_userbase/SVAfotate/lib/python3.8/site-packages/svafotate/__main__.py", line 41, in main
    args.func(parser, args)
  File "/nesi/nobackup/uoo03533/Cen/python_userbase/SVAfotate/lib/python3.8/site-packages/svafotate/svafotate_main.py", line 499, in annotate
    sources, datas, bed_lists, bed_headers, cov_lists, uniq_lists = process_bed_source(args.bed,covAF,uniqAF,size_limit)
  File "/nesi/nobackup/uoo03533/Cen/python_userbase/SVAfotate/lib/python3.8/site-packages/svafotate/utils.py", line 34, in process_bed_source
    bed_headers.append(header[4])
IndexError: list index out of range

Any advice would be much appreciated. Thank you, Cen

fakedrtom commented 1 year ago

Thanks for trying SVAFotate! Sounds like a useful thing to create a T2T version of the SVAFotate bed file. I think that liftover looks correct. Did you use the resulting SVAFotate_core_SV_popAFs.lift_to_hs1.bed as the input -b for SVAFotate? I suspect it may not be formatted in the way that SVAFotate expects. Would you mind sharing the SVAFotate command you used and also just check the start of that bed file to make sure it has the same format as SVAFotate_core_SV_popAFs.GRCh38.bed?

yusmiatiliau commented 1 year ago

Hi, Thank you for your reply. I found out what's wrong with the my bed file, it was to do with the separator with the header. I fixed that and it actually runs now, but I have some other errors. This time I think related to the cuteSV vcf file (but I might be wrong).

This are first two lines of the supporting bed header:

#CHROM  START   END     SVLEN   SVTYPE  SOURCE  SV_ID   AF      HomRef  Het     HomAlt  Male_AF Male_HomRef    Male_Het        Male_HomAlt     Male_HemiAlt    Male_HemiAF     Female_AF       Female_HomRef Female_Het       Female_HomAlt   AFR_AF  AFR_HomRef      AFR_Het AFR_HomAlt      AFR_Male_AF     AFR_Male_HomRef        AFR_Male_Het    AFR_Male_HomAlt AFR_Male_HemiAlt        AFR_Male_HemiAF AFR_Female_AF AFR_Female_HomRef        AFR_Female_Het  AFR_Female_HomAlt       AMR_AF  AMR_HomRef      AMR_Het AMR_HomAlt     AMR_Male_AF     AMR_Male_HomRef AMR_Male_Het    AMR_Male_HomAlt AMR_Male_HemiAlt        AMR_Male_HemiAF        AMR_Female_AF   AMR_Female_HomRef       AMR_Female_Het  AMR_Female_HomAlt       EAS_AFEAS_HomRef       EAS_Het EAS_HomAlt      EAS_Male_AF     EAS_Male_HomRef EAS_Male_Het    EAS_Male_HomAltEAS_Male_HemiAlt        EAS_Male_HemiAF EAS_Female_AF   EAS_Female_HomRef       EAS_Female_Het  EAS_Female_HomAlt      EUR_AF  EUR_HomRef      EUR_Het EUR_HomAlt      EUR_Male_AF     EUR_Male_HomRef EUR_Male_Het   EUR_Male_HomAlt EUR_Male_HemiAlt        EUR_Male_HemiAF EUR_Female_AF   EUR_Female_HomRef     EUR_Female_Het   EUR_Female_HomAlt       OTH_AF  OTH_HomRef      OTH_Het OTH_HomAlt      OTH_Male_AF   OTH_Male_HomRef  OTH_Male_Het    OTH_Male_HomAlt OTH_Male_HemiAlt        OTH_Male_HemiAF OTH_Female_AF OTH_Female_HomRef        OTH_Female_Het  OTH_Female_HomAlt       SAS_AF  SAS_HomRef      SAS_Het SAS_HomAlt     SAS_Male_AF     SAS_Male_HomRef SAS_Male_Het    SAS_Male_HomAlt SAS_Male_HemiAlt        SAS_Male_HemiAF        SAS_Female_AF   SAS_Female_HomRef       SAS_Female_Het  SAS_Female_HomAlt       PopMax_AF      InPop
1       12693   22331   10012   CNV     1000G   HGSV_4  0.5     0       904     0       0.5     0     459      0       NA      NA      0.5     0       445     0       0.5     0       300     0       0.5   0157     0       NA      NA      0.5     0       143     0       0.5     0       112     0       0.5   053      0       NA      NA      0.5     0       59      0       0.5     0       162     0       0.5   079      0       NA      NA      0.5     0       83      0       0.5     0       150     0       0.5   069      0       NA      NA      0.5     0       81      0       NA      NA      NA      NA      NA    NA       NA      NA      NA      NA      NA      NA      NA      NA      0.5     0       180     0     0.5      0       101     0       NA      NA      0.5     0       79      0       0.5     5

This is the SVAFotate command I ran:

svafotate annotate -v sample.cutesv.vcf \
-o sample.cutesv.gnomad.vcf \
-b SVAFotate_core_SV_popAFs.lift_to_hs1.bed \
-f 0.5 \
-a best EUR mf \
--cpu 16 \
-c 0 \
-s gnomAD \
-l 1000

And this is the error I get:

Traceback (most recent call last):
  File "/nesi/nobackup/uoo03533/Cen/python_userbase/SVAfotate/bin/svafotate", line 8, in <module>
    sys.exit(main())
  File "/nesi/nobackup/uoo03533/Cen/python_userbase/SVAfotate/lib/python3.8/site-packages/svafotate/__main__.py", line 41, in main
    args.func(parser, args)
  File "/nesi/nobackup/uoo03533/Cen/python_userbase/SVAfotate/lib/python3.8/site-packages/svafotate/svafotate_main.py", line 1118, in annotate
    write_max_values(sv_id,join_matches,["Male_AF","Male_Het","Male_HomAlt","Female_AF","Female_Het","Female_HomAlt"],req_sources,datas,header_cols,header_types,v)
  File "/nesi/nobackup/uoo03533/Cen/python_userbase/SVAfotate/lib/python3.8/site-packages/svafotate/annotation_utils.py", line 70, in write_max_values
    max_val = max(list(map(int,all_vals)))
ValueError: invalid literal for int() with base 10: 'None'

It did create an annotated bed file, but it was truncated (missing ~10K lines).

Thank you.

fakedrtom commented 1 year ago

I'm glad you have it working, but that last error is a bit peculiar. It is writing out the annotations, but appears to be halted because it encountered a 'None' in the list of values when trying to find a max. This might be a problem on my end that I'll need to check. Could you do me a favor and try running it without EUR and mf in the -a option and let me know if it runs to completion?

yusmiatiliau commented 1 year ago

Hi again, thanks for coming back so quickly. I just tried, and It did run to completion if I remove the EUR and mf from -a option. So, it has something to do with some "None" values in the supporting bed file? Is there any workaround if I still want to use AF from EUR population? Thanks

fakedrtom commented 1 year ago

Sounds like this is a problem on my end where I didn't properly account for 'None' in my lists. Let me see if I can't get this fixed by tomorrow.

In the meantime, if you are in a hurry, you should be able to take the VCF you just annotated and filter those SVs accordingly. Any SVs without any matches per SVAFotate will have a EUR AF of 0. You can then check any others that pass your filtering for EUR AFs by looking back at the input BED file. A bedtools intersect command may help if you have a lot of passing SVs to check.

I do suspect I'll be able to get a fix for this by tomorrow.

fakedrtom commented 1 year ago

I just posted what I think should fix the problem for you. Please update and give it a try. Let me know if you still get the same error.

Coincidentally, figuring this out revealed some other things in SVAFotate that I should address regarding the SVAFotate_core_SV_popAFs.GRCh38.bed.gz file and also how hemizygous regions are being handled so thanks for helping me identify those things too!

yusmiatiliau commented 1 year ago

Hi, yes, it did fix it. Thanks so much for your kind and prompt response!

fakedrtom commented 1 year ago

Great!