yoshihikosuzuki / ClassPro

A K-mer classifier for HiFi reads
GNU General Public License v3.0

Segmentation fault (core dumped) occurs due to unknown reasons #6

Open xujialupaoli opened 2 months ago

xujialupaoli commented 2 months ago

Thank you very much for providing such a useful tool! I ran ClassPro on simulated diploid potato data, and the program crashed for a reason I cannot identify. I took the first 5 Mb of chromosome 1 from two potato haplotypes and simulated 60× HiFi data from them. Then I ran the following commands:

FastK -v -k40 -t46 -p potato_5M_2haps.fq 2>&1 |tee log_fastk


$ "/home/jialusoftware/ClassPro/bin/ClassPro"  -v potato_5M_2haps
Info about inputs:
    # of sequence files   = 1
    First (path,root,ext) = (., potato_5M_2haps, .fq)
    FASTK outputs' root   = ./potato_5M_2haps
    Output .class file    = ./potato_5M_2haps.class
    Temp dir path         = /home/work//hapmer_5M_2haps/
    Total # of reads      = 16342
    # of reads per thread = 4086
Global histogram inspection:
    Tallest peak count    = 26 (# of k-mers = 9666852)
    Estimated (H,D) cov   = (26,52)
    Estimated R-threshold = 88
Error model not specified. Using the default error model.
Classifying 40-mers...
Segmentation fault (core dumped)

I can see that some *.class.* files were written. However, when I ran ClassPro on simulated E. coli data earlier, there was no error and a final *.class file was produced. I don't know why this case behaves differently. Could you please help me figure it out? Looking forward to your reply!


$ ls -lht
total 586M
-rw-rw-r-- 1 jialu jialu 8.8M Jul  1 09:33 potato_5M_2haps.class.1
-rw-rw-r-- 1 jialu jialu 904K Jul  1 09:33 potato_5M_2haps.class.3
-rw-rw-r-- 1 jialu jialu 2.0M Jul  1 09:33 potato_5M_2haps.class.2
-rw-rw-r-- 1 jialu jialu 328K Jul  1 09:33 potato_5M_2haps.class.4
-rw-rw-r-- 1 jialu jialu 2.7K Jul  1 09:33 log_fastk
-rwxrwxr-x 1 jialu jialu    8 Jul  1 09:33 potato_5M_2haps.prof
-rwx------ 1 jialu jialu 513K Jul  1 09:33 potato_5M_2haps.ktab
-rwxrwxr-x 1 jialu jialu 257K Jul  1 09:33 potato_5M_2haps.hist
-rw-rw-r-- 1 jialu jialu 573M Jul  1 09:31 potato_5M_2haps.fq

yoshihikosuzuki commented 1 month ago

Thanks for the report! And very sorry for the late reply.

I strongly suspect the problem is the -t46 option you specified in the FastK command before running ClassPro. It means that every k-mer occurring fewer than 46 times in the input read dataset is discarded. (Perhaps what you actually wanted was -T, the number of CPU cores?) The k-mer count profile that ClassPro takes as input needs to be generated with -t1 (keeping all k-mer counts) in FastK. Please try this if time still permits.
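
A minimal sketch of the suggested rerun, for illustration only. It reuses the file name and the -v, -k40 and -p options from the original report and changes only -t46 to -t1. The thread count (NTHREADS=4) is an arbitrary placeholder, and the sketch assumes that FastK and ClassPro are both on PATH, which the original report does not state:

```shell
# Rerun FastK keeping all k-mer counts (-t1), then rerun ClassPro.
READS=potato_5M_2haps.fq   # input file name taken from the report
NTHREADS=4                 # assumption: placeholder, adjust to your machine

# -t1 keeps every k-mer count; -T sets the number of threads.
FASTK_CMD="FastK -v -k40 -t1 -T${NTHREADS} -p ${READS}"
# ClassPro takes the root name, i.e. the file name without ".fq".
CLASSPRO_CMD="ClassPro -v ${READS%.fq}"

# Show the commands so they can be checked before anything runs.
echo "$FASTK_CMD"
echo "$CLASSPRO_CMD"

# Execute only when both tools and the input file are actually present.
if command -v FastK >/dev/null 2>&1 &&
   command -v ClassPro >/dev/null 2>&1 &&
   [ -f "$READS" ]; then
    $FASTK_CMD 2>&1 | tee log_fastk
    $CLASSPRO_CMD
fi
```

Printing the commands first makes it easy to confirm that the lowercase -t (count threshold) and the uppercase -T (threads) have not been swapped again.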

Best, Yoshi