ucl-pathgenomics / HaROLD

Haplotype Reconstruction of Longitudinal Deep sequencing data
MIT License

Error when run HaROLD #2

Closed liaoherui closed 3 years ago

liaoherui commented 3 years ago

Hi, I ran into a problem when running HaROLD according to the manual.

When I run the command below,

java -cp HAROLD/lib/htsjdk-unspecified-SNAPSHOT.jar:\
HaROLD/lib/picocli-4.1.2.jar: \
HaROLD/lib/pal-1.5.1.jar: \
HaROLD/lib/cache2k-all-1.0.2.Final.jar: \
HaROLD/lib/commons-math3-3.6.1.jar: \
HaROLD/jar/MakeReadCount.jar makereadcount.MakeReadCount file.bam

I got this error:

Error: Could not find or load main class HaROLD.lib.pal-1.5.1.jar:
Caused by: java.lang.ClassNotFoundException: HaROLD.lib.pal-1.5.1.jar:

Do you have any idea about this? Thanks a lot!

cristina86cristina commented 3 years ago

Hello, I will look into your issue as soon as I can. Which version of Java are you using? Can you also tell me which mapping software you used (e.g. bwa, bbmap)? Thanks! Cristina

liaoherui commented 3 years ago

Hi Cristina, thanks for your prompt reply! I am using Java version "11.0.1".

I used bwa (version: 0.7.17-r1188) to map the reads to the reference and samtools (version: 1.10) to produce the ".bam" file. The commands I used are shown below.

bwa mem  ref.fasta  test_1.fq test_2.fq > test.sam
samtools view -S -b  test.sam > test.bam
samtools view -h -G69 test.bam | samtools view -h -G133 > file.bam

Thanks!

cristina86cristina commented 3 years ago

Thanks for the details! Can you try to run:

java -cp HaROLD/lib/htsjdk-unspecified-SNAPSHOT.jar:\
HaROLD/lib/picocli-4.1.2.jar:lib/pal-1.5.1.jar:\
HaROLD/lib/cache2k-all-1.0.2.Final.jar:\
HaROLD/lib/commons-math3-3.6.1.jar:\
HaROLD/jar/MakeReadCount.jar \
makereadcount.MakeReadCount file.bam

I just noticed that some spaces were added by mistake when the manual was saved! Let me know if this works; if so, I will edit the manual.
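A minimal sketch of why the stray spaces matter, for anyone who hits the same error. A backslash at the end of a line joins it to the next one, but a space before the backslash still ends the current shell word. The `-cp` value therefore stops at the space, and `java` treats the next word (here `HaROLD/lib/pal-1.5.1.jar:`) as the main class name, which matches the error reported above. The helper names `count_words` and `join_classpath` and the `HAROLD_DIR` variable are made up for this illustration and are not part of HaROLD.

```shell
# Hypothetical helper: prints how many arguments the shell passed to it.
count_words() { echo "$#"; }

# No space before the backslash: the two lines form one word.
count_words a.jar:\
b.jar
# prints 1

# A space before the backslash: the shell sees two separate words.
count_words a.jar: \
b.jar
# prints 2

# Hypothetical helper: joins jar paths with ':' so the classpath
# is always a single word with no embedded spaces.
join_classpath() {
    cp=""
    for jar in "$@"; do
        if [ -z "$cp" ]; then
            cp="$jar"
        else
            cp="$cp:$jar"
        fi
    done
    printf '%s\n' "$cp"
}

# Assumption: HaROLD was cloned into ./HaROLD; adjust as needed.
HAROLD_DIR="HaROLD"
HAROLD_CP=$(join_classpath \
    "$HAROLD_DIR/lib/htsjdk-unspecified-SNAPSHOT.jar" \
    "$HAROLD_DIR/lib/picocli-4.1.2.jar" \
    "$HAROLD_DIR/lib/pal-1.5.1.jar" \
    "$HAROLD_DIR/lib/cache2k-all-1.0.2.Final.jar" \
    "$HAROLD_DIR/lib/commons-math3-3.6.1.jar" \
    "$HAROLD_DIR/jar/MakeReadCount.jar")
echo "$HAROLD_CP"

# The command would then be (left commented out in this sketch):
# java -cp "$HAROLD_CP" makereadcount.MakeReadCount file.bam
```

Quoting `"$HAROLD_CP"` keeps the whole classpath as one argument even if a path contains spaces.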

liaoherui commented 3 years ago

This works for me, thanks! However, when I continued to run the tool according to the manual, I got a new error in Step-2. The commands I used are:

# Step-1 -> Runs well, no problem.
java -jar HaROLD/jar/Cluster_RG/dist/Cluster_RG.jar \
--count-file sample.txt --haplotypes 2 --alpha-frac 0.5 --gamma-cache 10000 \
-H -L --threads 4 -p  Step1_results
# Step-2
java -cp HRPOLD/lib/htsjdk-unspecified-SNAPSHOT.jar:\
HaROLD/lib/picocli-4.1.2.jar:\
HaROLD/lib/pal-1.5.1.jar:\
HaROLD/lib/commons-math3-3.6.1.jar:\
HaROLD/lib/cache2k-all-1.0.2.Final.jar:\
HaROLD/lib/flanagan.jar:\
HaROLD/jar/RefineHaplotypes.jar refineHaplotypes.RefineHaplotypes \
-t sample2 --bam file.bam \
--baseFreq Step1_results.lld --refSequence ref/ref.fasta \
--hapAlignment Step1_resultsHaplo.fasta  --iterate

and the error output is:

Command arguments:  -t sample2 --bam file.bam --baseFreq Step1_results.lld --refSequence ref/ref.fasta --hapAlignment Step1_resultsHaplo.fasta --iterate
Initialising hapfreq with even values
Exception in thread "main" java.lang.NoClassDefFoundError: htsjdk/samtools/SamReaderFactory
        at refineHaplotypes.RefineHaplotypes.readFromFile(RefineHaplotypes.java:829)
        at refineHaplotypes.RefineHaplotypes.readData(RefineHaplotypes.java:748)
        at refineHaplotypes.RefineHaplotypes.run(RefineHaplotypes.java:127)
        at refineHaplotypes.RefineHaplotypes.main(RefineHaplotypes.java:118)
Caused by: java.lang.ClassNotFoundException: htsjdk.samtools.SamReaderFactory
        at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583)
        at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
        at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
        ... 4 more

cristina86cristina commented 3 years ago

I cannot replicate your error. The only thing I can see is a typo: "HRPOLD/lib/htsjdk-unspecified-SNAPSHOT.jar:\" should be "HaROLD/lib/htsjdk-unspecified-SNAPSHOT.jar:\" (I will check the manual and fix that). If that doesn't solve the issue, I can try to run the refinement with your data, if you are OK to share it.
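A typo like this is easy to miss because `java` does not complain about classpath entries that do not exist; the problem only surfaces later as a `NoClassDefFoundError` when a class from the missing jar is first needed. As a rough sketch, a small pre-flight check can catch a misspelled path before launching. The function name `check_classpath` is made up for this example and is not part of HaROLD.

```shell
# Hypothetical helper: reports every ':'-separated classpath entry
# that does not exist on disk. Returns 0 if all entries exist, 1 otherwise.
check_classpath() {
    missing=0
    old_ifs=$IFS
    IFS=:
    # Unquoted on purpose: with IFS set to ':' the shell splits on colons.
    for entry in $1; do
        if [ ! -e "$entry" ]; then
            echo "missing classpath entry: $entry" >&2
            missing=1
        fi
    done
    IFS=$old_ifs
    return $missing
}

# Example usage (left commented out in this sketch):
# check_classpath "HaROLD/lib/htsjdk-unspecified-SNAPSHOT.jar:HaROLD/jar/RefineHaplotypes.jar" \
#     && echo "classpath looks fine"
```

With the command from the report above, this check would flag the `HRPOLD/lib/...` entry, provided no directory of that name exists.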

liaoherui commented 3 years ago

That fixes my problem, thanks a lot! It runs successfully now. One additional question: my test sequencing data contain two haplotype strains, but the tool returns only one haplotype sequence after Step-2, whereas the output of Step-1 contains two. In this case, which result is more reliable?

cristina86cristina commented 3 years ago

Great! Step-2 (the refinement) is generally more reliable because it takes the results from Step-1 and merges haplotypes that are not independent. Since you mentioned that you have two strains, can I ask which virus you are working on? Were your data sequenced on Illumina? Our tool was developed and tested with Illumina reads (as described in the paper). Do you have longitudinal samples?

liaoherui commented 3 years ago

Thanks for your prompt reply. I am working on COVID-19. My dataset is simulated and should be Illumina reads. I will take the output of Step-2 as my final result!