PacificBiosciences / pbbioconda

PacBio Secondary Analysis Tools on Bioconda. Contains list of PacBio packages available via conda.
BSD 3-Clause Clear License
243 stars 44 forks source link

visualize.py issue #660

Closed mariachiara-github closed 4 months ago

mariachiara-github commented 4 months ago

Running the visualize.py script I got this error Error message gene1_start = min([x[0][1][0][0] for x in list(alignments.values()) if x[0][0] == breakpoints[0][0]]) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ValueError: min() arg is an empty sequence

This is the code I run:

python3 visualize_fusion.py \
    -o fusion_browser_shot.png \
    -a gencode.v38.annotation.gtf \
    -f prefix.breakpoints.groups.bed \
    -b prefix.mapped.bam

I took as input the gtf file used for the pbfusion run, the prefix.breakpoints.groups.bed generated after the pbfusion run, and the aligned Iso-Seq HiFi data in a BAM file format (used also in the pbfusion). Maybe is this last BAM file that I used which is not the right one..

velociroger-pb commented 4 months ago

I think I know what the issue is and should have a fix soon. Would you be able to send me the breakpoint that you're trying to plot and the associated reads in a filtered bam please?

mariachiara-github commented 4 months ago

I think I know what the issue is and should have a fix soon. Would you be able to send me the breakpoint that you're trying to plot and the associated reads in a filtered bam please?

Thank you so much for your reply! These are the 2 files you requested, let me know if there are some problems (these are txt files, I could not upload BED or BAM files apparently..). Also, I forgot to mention that I'm using the gencode.v38.chr_patch_hapl_scaff.annotation.gtf as GTF file

breakpoint.bed.txt reads_filtered.bam.txt

zeeev commented 4 months ago

We should have a release to fix this issue in the next few days, @meryyr

tky199996 commented 4 months ago

我想我知道问题是什么并且应该尽快解决。您能否尝试较差的断点以及过滤后的 bam 中的相关读数发送给我?

非常感谢你的回复! 这是您请求的2个文件,如果有问题请告诉我(这些是txt文件,我显然无法上传BED或BAM文件..)。另外,我记得想起我正在使用gencode .v38.chr_patch_hapl_scaff.annotation.gtf 作为 GTF 文件

Breakpoint.bed.txtreads_filtered.bam.txt Hello, I recently encountered the same problem as you, has your problem been solved?

mariachiara-github commented 4 months ago

We should have a release to fix this issue in the next few days, @meryyr

Perfect, thank you so much!

mariachiara-github commented 4 months ago

我想我知道问题是什么并且应该尽快解决。您能否尝试较差的断点以及过滤后的 bam 中的相关读数发送给我?

非常感谢你的回复! 这是您请求的2个文件,如果有问题请告诉我(这些是txt文件,我显然无法上传BED或BAM文件..)。另外,我记得想起我正在使用gencode .v38.chr_patch_hapl_scaff.annotation.gtf 作为 GTF 文件 Breakpoint.bed.txtreads_filtered.bam.txt Hello, I recently encountered the same problem as you, has your problem been solved?

Hello, the problem has not been solved yet, but should be in the days

velociroger-pb commented 4 months ago

@meryyr would you be able to send the header for your reads please? Just send the output of samtools view -H reads_filtered.bam. I'm having trouble reproducing your breakpoint + pysam won't take the file without a header

velociroger-pb commented 4 months ago

@tky199996 I've just released v0.4.1 and updated the visualization script, which should hopefully fix your error.

@meryyr Your fusion is a bit more complex (there's an N/A in the gene list), so it might take a bit of fiddling to get the visualization script working. Once you provide me with the header for your alignment file, I'll be able to test and see what modifications need to be made such that you can visualize your fusion. It should be noted that the visualization script doesn't support complex fusions (>2 genes).

tky199996 commented 4 months ago

@tky199996 I've just released v0.4.1 and updated the visualization script, which should hopefully fix your error.

@meryyr Your fusion is a bit more complex (there's an N/A in the gene list), so it might take a bit of fiddling to get the visualization script working. Once you provide me with the header for your alignment file, I'll be able to test and see what modifications need to be made such that you can visualize your fusion. It should be noted that the visualization script doesn't support complex fusions (>2 genes). Hello, thank you for your help. Thank you very much. I wonder what time it is over there now? I tried a new script, but the problem still occurred. I will now upload my input file. Can you help me see if it is a file format problem? Thank you again. image I just found that my bam file cannot be transferred. I uploaded a screenshot of the title of my bam file. I hope this will work.

tky199996 commented 4 months ago

image image Then the two above are my gtf file and the bed file without title line obtained using pbfusion. Thank you for your help and hope we can resolve this issue successfully

tky199996 commented 4 months ago

image At the same time, I also noticed that there are N/A identifiers in some files. What does this mean? Is it an unknown gene?

mariachiara-github commented 4 months ago

@tky199996 I've just released v0.4.1 and updated the visualization script, which should hopefully fix your error.

@meryyr Your fusion is a bit more complex (there's an N/A in the gene list), so it might take a bit of fiddling to get the visualization script working. Once you provide me with the header for your alignment file, I'll be able to test and see what modifications need to be made such that you can visualize your fusion. It should be noted that the visualization script doesn't support complex fusions (>2 genes).

Sorry for the late reply, and thank you for the help! the zip file attached contains the header of the aligned reads (reads aligned with the pbmm2 tool), I hope this is the output you need! Please let me know if you need something else :) header_aligned.zip

mariachiara-github commented 4 months ago

image At the same time, I also noticed that there are N/A identifiers in some files. What does this mean? Is it an unknown gene?

yes, I think that N/A means that is a segment not matching known genes

velociroger-pb commented 4 months ago

@tky199996 the visualization script update is meant to be used with pbfusion v0.4.1 (it contains a bugfix for getting genes and their IDs ordered 5->3 in the bed file). Did you rerun with the new binary?

@meryyr Thanks for the header - I'll see if I can get this working

velociroger-pb commented 4 months ago

@meryyr in looking at your fusion I discovered an edge case that I need to fix. I'll close this for now until I have a fix (maybe next week).

mariachiara-github commented 3 months ago

@meryyr in looking at your fusion I discovered an edge case that I need to fix. I'll close this for now until I have a fix (maybe next week).

Okay thank you so much! Let me know if you have some news :)