bluenote-1577 / flopp

flopp is a software package for single individual haplotype phasing of polyploid organisms from long read sequencing.

no output generated after reading inputs #5

Closed XuanlinWu closed 2 years ago

XuanlinWu commented 2 years ago

Hi, flopp developer. I have a problem with the outputs when running flopp. After the "Reading inputs" step the job stopped, but neither the results file nor the partition results were generated. The command is: flopp -t 52 -b 1m.aln.sorted.p.bam -c 1m.g.vcf.gz -p 6 -o floppres.txt -P flopppartition. The output log is: Reading inputs (BAM/VCF/frags). Time taken reading inputs 2679.259520149s. Since there is no error message, I could not figure out why flopp did not generate any output. Any ideas?

bluenote-1577 commented 2 years ago

Hi Xuanlin,

Thanks for the issue. I'll do some testing and get back to you soon.

Here are some quick thoughts:

  1. I would recommend subsampling the reads and testing only on the first 1m bases of your genome, because reading the BAM file can take a while.
  2. It looks like the BAM file was read successfully and nothing was output. Can you double check your VCF file to see whether the contig had variants called?
  3. Supplementary reads are not used.
  4. I never tested on gzipped files, so maybe try unzipping the VCF.
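
For point 2, here is a minimal sketch of one way to see which contigs actually have variant records in the VCF. It is not part of flopp. The function names `open_text` and `count_variants_per_contig` are made up for illustration, and it assumes a standard tab-separated VCF where column 1 is the contig and column 5 is ALT (records with ALT set to "." are treated as having no variant).

```python
import gzip
from collections import Counter


def open_text(path):
    """Open a plain or gzip-compressed text file for reading (hypothetical helper)."""
    with open(path, "rb") as fh:
        magic = fh.read(2)
    # gzip (and bgzip) files start with the bytes 0x1f 0x8b.
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rt")
    return open(path, "rt")


def count_variants_per_contig(vcf_path):
    """Count VCF records with a non-missing ALT allele, grouped by contig."""
    counts = Counter()
    with open_text(vcf_path) as fh:
        for line in fh:
            # Skip header lines and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            # Skip truncated records and records without an ALT allele.
            if len(fields) < 5 or fields[4] == ".":
                continue
            counts[fields[0]] += 1
    return counts
```

Calling `count_variants_per_contig("1m.g.vcf.gz")` returns a mapping from contig name to record count. If the contig names in it do not match the reference names in the BAM file, or the contig of interest is missing, no read can be matched to a variant.
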

XuanlinWu commented 2 years ago

Thanks for your reply. I will go through and check these problems.

bluenote-1577 commented 2 years ago

Hi Xuanlin,

I tested again. It looks like gzipped files can be read without problems, so that is not the issue.

I was able to recreate your output when none of the reads in my BAM file overlapped any SNPs. So I would double check whether your VCF file is malformed and make sure that your reads overlap the SNPs in your VCF file.
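
The overlap check can be sketched roughly as follows. This is an illustration only, not how flopp does it internally. The function name `count_reads_overlapping_snps` is hypothetical, and it assumes you have already extracted read alignments as (contig, start, end) tuples with 0-based, half-open coordinates (the convention BAM uses) and SNPs as (contig, pos) tuples with 1-based positions (the convention VCF uses).

```python
from bisect import bisect_left
from collections import defaultdict


def count_reads_overlapping_snps(reads, snps):
    """Count reads whose aligned interval covers at least one SNP.

    reads: iterable of (contig, start, end), 0-based half-open (assumed).
    snps:  iterable of (contig, pos), 1-based as in VCF POS (assumed).
    """
    # Group SNP positions by contig, converted to 0-based coordinates.
    by_contig = defaultdict(list)
    for contig, pos in snps:
        by_contig[contig].append(pos - 1)
    for positions in by_contig.values():
        positions.sort()

    overlapping = 0
    for contig, start, end in reads:
        positions = by_contig.get(contig, [])
        # Find the first SNP at or after the read start; it overlaps
        # the read only if it lies before the read end.
        i = bisect_left(positions, start)
        if i < len(positions) and positions[i] < end:
            overlapping += 1
    return overlapping
```

If this returns 0 for your data, that matches the situation I reproduced. Note that contig names are compared exactly, so a read on "chr1" never overlaps a SNP recorded on "1".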

XuanlinWu commented 2 years ago

Thanks. I checked the VCF file with bcftools and confirmed that it contains no variants overlapping the reads.