Parsoa / SVDSS

Improved structural variant discovery in accurate long reads using sample-specific strings (SFS)
MIT License

Segmentation fault (core dumped) error on v1.0.4 #14

Open nirvana693 opened 1 year ago

nirvana693 commented 1 year ago

I am running into a segmentation fault while running "smooth", similar to https://github.com/Parsoa/SVDSS/issues/12. I installed SVDSS through conda and also tried the binary distribution; both give the same result (segfault). It worked on one sample but fails on another (I have checked 2 so far).

SVDSS, Structural Variant Discovery from Sample-specific Strings.
smooth
[I] SVDSS_Contig
[I] Loading reference genome from chm13v2.0.fa..
[I] Extracted chr1 with 248387328 bases.
[I] Extracted chr2 with 242696752 bases.
[I] Extracted chr3 with 201105948 bases.
[I] Extracted chr4 with 193574945 bases.
[I] Extracted chr5 with 182045439 bases.
[I] Extracted chr6 with 172126628 bases.
[I] Extracted chr7 with 160567428 bases.
[I] Extracted chr8 with 146259331 bases.
[I] Extracted chr9 with 150617247 bases.
[I] Extracted chr10 with 134758134 bases.
[I] Extracted chr11 with 135127769 bases.
[I] Extracted chr12 with 133324548 bases.
[I] Extracted chr13 with 113566686 bases.
[I] Extracted chr14 with 101161492 bases.
[I] Extracted chr15 with 99753195 bases.
[I] Extracted chr16 with 96330374 bases.
[I] Extracted chr17 with 84276897 bases.
[I] Extracted chr18 with 80542538 bases.
[I] Extracted chr19 with 61707364 bases.
[I] Extracted chr20 with 66210255 bases.
[I] Extracted chr21 with 45090682 bases.
[I] Extracted chr22 with 51324926 bases.
[I] Extracted chrX with 154259566 bases.
[I] Extracted chrY with 62460029 bases.
[I] Extracted chrM with 16569 bases.
[I] Loading first batch..
Segmentation fault (core dumped)

flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single intel_ppin ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts pku ospke avx512_vnni md_clear flush_l1d arch_capabilities
model name : Intel(R) Xeon(R) Platinum 8268 CPU @ 2.90GHz

I am running the job on an HPC large-memory node with 6 TB of RAM. The BAM is 25 MB.

Please let me know if you need more info

ldenti commented 1 year ago

Hi, how many alignments do the BAM(s) contain? Both the one that succeeded and the one that didn't.

Moreover, would it be possible for you to share the .bam?

nirvana693 commented 1 year ago

Everything is mapped. The data was mapped using pbmm2 (internally uses minimap2). These are hifiasm assembled contigs which I am using as pseudo reads.

21249 + 0 in total (QC-passed reads + QC-failed reads)
7520 + 0 primary
0 + 0 secondary
13729 + 0 supplementary
0 + 0 duplicates
0 + 0 primary duplicates
21249 + 0 mapped (100.00% : N/A)
7520 + 0 primary mapped (100.00% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
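
For reference, the counts above can be checked programmatically. This is a minimal sketch, assuming the `samtools flagstat` text is available as a string; the names `FLAGSTAT`, `LINE_RE`, and `parse_flagstat` are illustrative and are not part of SVDSS or samtools. It only restates the numbers already shown: most records in this BAM are supplementary alignments, which is expected when contigs are mapped as pseudo reads.

```python
import re

# First lines of the flagstat output posted above (copied verbatim).
FLAGSTAT = """\
21249 + 0 in total (QC-passed reads + QC-failed reads)
7520 + 0 primary
0 + 0 secondary
13729 + 0 supplementary
21249 + 0 mapped (100.00% : N/A)
"""

# "<passed> + <failed> <label>" with an optional trailing "(...)" note.
LINE_RE = re.compile(r"^(\d+) \+ (\d+) (.+?)(?: \(.*\))?$")


def parse_flagstat(text):
    """Return {label: (qc_passed, qc_failed)} for each recognised line."""
    counts = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            passed, failed, label = match.groups()
            counts[label] = (int(passed), int(failed))
    return counts


counts = parse_flagstat(FLAGSTAT)
total = counts["in total"][0]
primary = counts["primary"][0]
supplementary = counts["supplementary"][0]

# Share of all records that are supplementary alignments.
supplementary_fraction = supplementary / total

print(f"total records:          {total}")
print(f"primary alignments:     {primary}")
print(f"supplementary:          {supplementary}")
print(f"supplementary fraction: {supplementary_fraction:.1%}")
```

Running the same parser on the flagstat output of the sample that succeeded would give a quick side-by-side comparison of the two BAMs.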

Is there a link which I can use to upload the bam?

ldenti commented 1 year ago

mmm I see. Maybe something goes off since these are not "real" HiFi reads.. But I have to check them..

If GitHub doesn't allow you to upload the BAM here (I don't remember the file size limit), can you use a hosting service like Google Drive?

ldenti commented 1 year ago

Hi @nirvana693, any news on this? I had some spare time, so I tried to build contigs with hifiasm, align them with pbmm2, and then smooth them. Unfortunately, I couldn't replicate your error. I suspect it's too data dependent.

One test you can do - in case you are still working on this - is to compile and try the lowcov branch. It's the development branch where I updated the smoothing and fixed some bugs.

Best,

ldenti commented 1 year ago

I merged the lowcov branch. v1.0.5 is now out and contains all the improvements and bug fixes from that branch. Please check it out if you are still interested. It also contains a bug fix for clusters with very low weight (such as those you could obtain when calling from contigs rather than reads).