AntoineHo / CircosAlignmentPlotter

Converts a part of an alignment (.PAF perhaps others sometimes) to a Circos image using BED and fasta files.
Apache License 2.0

IndexError: list index out of range #3

Open Asrix opened 1 day ago

Asrix commented 1 day ago

Hi,

I would really like to make this code work, but I am getting IndexError: list index out of range at line 160. I am using Python 3.10.8.

My command is:

python p2c.py /nobackup/qtwh28/SnowPet/Genome/PenguinGenome/SnowPeng.minimap2.paf /nobackup/qtwh28/SnowPet/Genome/FullAssembly/SNPE_L3_nomito_purged2.np2_HiC_2_scaffolds_final.fa /nobackup/qtwh28/SnowPet/Genome/PenguinGenome/Spheniscus_humboldti.fa /nobackup/qtwh28/SnowPet/Genome/PenguinGenome/input.bed ./Snow_Peng_p2c/ --templates ./

All of my input files look to be properly formatted and complete. The template files haven't been modified, but if I don't include them then I get this error "Exception: ERROR: Invalid template file directory".

WARNING: Directory already exists
Query: /nobackup/qtwh28/SnowPet/Genome/FullAssembly/SNPE_L3_nomito_purged2.np2_HiC_2_scaffolds_final.fa
Reference: /nobackup/qtwh28/SnowPet/Genome/PenguinGenome/Spheniscus_humboldti.fa
Targets: /nobackup/qtwh28/SnowPet/Genome/PenguinGenome/input.bed
PAF: /nobackup/qtwh28/SnowPet/Genome/PenguinGenome/SnowPeng.minimap2.paf
Outdir: /nobackup/qtwh28/SnowPet/Genome/PenguinGenome/Snow_Peng_p2c
Templates: /nobackup/qtwh28/SnowPet/Genome/PenguinGenome
s: Spheniscus_humboldti.fa
Traceback (most recent call last):
  File "/nobackup/qtwh28/SnowPet/Genome/PenguinGenome/p2c.py", line 472, in <module>
    main()
  File "/nobackup/qtwh28/SnowPet/Genome/PenguinGenome/p2c.py", line 463, in main
    targets_to_plot = readTargets(iFH.targets, correspondance, contigs_lengths)
  File "/nobackup/qtwh28/SnowPet/Genome/PenguinGenome/p2c.py", line 160, in readTargets
    start = int(s[1])
IndexError: list index out of range

Any ideas on what I should fix?

AntoineHo commented 22 hours ago

Hi, this is some old code, I should probably clean it up... It looks like the problem comes from the BED file formatting, probably spaces instead of tabs between columns. You could try using sed to fix it: sed 's/ \+ /\t/g' targets.bed > targets.tab.bed

Please let me know if the problem persists using a tab-separated bed file. Cheers,

Antoine
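For anyone hitting the same traceback: `start = int(s[1])` raises IndexError whenever a split line yields fewer than two fields, which is consistent with a BED line that is blank or separated by spaces rather than tabs. Below is a minimal sketch of a pre-flight check, assuming the script splits each BED line on tabs and needs at least the chrom, start, and end columns. The function names `find_bad_bed_lines` and `normalize_bed_line` are hypothetical and are not part of p2c.py.

```python
import re


def find_bad_bed_lines(lines, min_fields=3):
    """Return (line_number, text) for lines with too few tab-separated fields.

    Blank lines are reported too, since splitting them yields a single
    empty field.
    """
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if len(text.split("\t")) < min_fields:
            bad.append((number, text))
    return bad


def normalize_bed_line(line):
    """Collapse any run of spaces or tabs between columns into a single tab.

    Assumption: no column value (e.g. a contig name) contains whitespace.
    """
    return re.sub(r"[ \t]+", "\t", line.strip())


if __name__ == "__main__":
    # In-memory example; with a real file, pass the open file handle instead.
    example = ["contig_1 100 2000\n", "contig_2\t0\t500\n", "\n"]
    for number, text in find_bad_bed_lines(example):
        print(f"line {number}: {text!r}")
```

Running the check on the BED file before calling p2c.py should point to the offending line numbers directly, which is usually quicker than rebuilding the file.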

Asrix commented 11 hours ago

Hi Antoine,

Thanks! Using sed didn't fix it, but since you thought the BED file was probably the problem, I rebuilt the BED file with bamToBed and that works now. It is throwing an error on the actual plot, but I think I might be able to figure that out: error on line 3 at column 1: Start tag expected, '<' not found

Thanks for the quick reply!

AntoineHo commented 6 hours ago

Hi, I found one issue with the original code: some query contigs were not being plotted. I made a quick fix, but I would need time to make sure that it works properly. As I said, this code requires a big cleanup.

I also added a minimal version that uses only a .paf file to make the Circos plot. It will not crop chromosomes or hide ideograms from the query or target, so you would need to remove unwanted alignments directly in the input .paf file, but it is also much easier to use: python p2c_min.py aln.paf outdir

Hope this can help you
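Since the minimal version plots whatever is in the .paf file, unwanted alignments have to be removed beforehand. One way to do that is a small filter, sketched below under the assumption that the input follows the standard PAF layout (12 mandatory tab-separated columns, with the query name in column 1, the target name in column 6, and the alignment block length in column 11). The function `filter_paf` and its parameters are hypothetical and are not part of this repository.

```python
def filter_paf(lines, keep_queries=None, keep_targets=None, min_aln_len=0):
    """Yield PAF lines that pass the name and length filters.

    keep_queries / keep_targets: optional sets of sequence names to keep;
    None means "keep all". min_aln_len: minimum alignment block length.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # not a complete PAF record, skip it
        query, target = fields[0], fields[5]
        aln_len = int(fields[10])  # column 11: alignment block length
        if keep_queries is not None and query not in keep_queries:
            continue
        if keep_targets is not None and target not in keep_targets:
            continue
        if aln_len < min_aln_len:
            continue
        yield line


if __name__ == "__main__":
    # In-memory example records; with real data, iterate over an open file
    # and write the kept lines to a new .paf file.
    records = [
        "q1\t1000\t0\t900\t+\tt1\t5000\t100\t1000\t850\t900\t60\n",
        "q2\t800\t0\t200\t-\tt2\t4000\t50\t250\t180\t200\t30\n",
    ]
    for kept in filter_paf(records, keep_targets={"t1"}, min_aln_len=500):
        print(kept, end="")
```

The filtered file can then be passed to p2c_min.py in place of the original alignment.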