marbl / merqury

k-mer based assembly evaluation

No resulting plots.. #82

Closed kwiyounghan closed 1 year ago

kwiyounghan commented 1 year ago

Hi,

I'm trying to use Merqury to assess a phased assembly produced by hifiasm from HiFi reads with integrated Hi-C data. However, I don't get any plots after the Merqury run, and searching Google and this issue tracker hasn't helped. (I've looked at issue #51, "No plots generated", but it doesn't contain any answers that work for me.)

What I did:

Step 1. Create the meryl databases. There are two files from two HiFi runs of the same library, so:

meryl k=19 count threads=32 output AL556_194_1_hifi_reads_19.meryl ../raw/AL556_194_1_hifi_reads.fastq  
meryl k=19 count threads=32 output AL556_194_2_hifi_reads_19.meryl ../raw/AL556_194_2_hifi_reads.fastq    

Step 2. Merge

 meryl union-sum threads=32 output AL556_194_12_19_combined.meryl AL556_194_1_hifi_reads_19.meryl AL556_194_2_hifi_reads_19.meryl

Step 3. Merqury

asm_hap1=../00_assembly/hifiasm/AL556_194_12_hic_hifiasm.asm.hic.hap1.p_ctg.fa
asm_hap2=../00_assembly/hifiasm/AL556_194_12_hic_hifiasm.asm.hic.hap2.p_ctg.fa

$MERQURY/merqury.sh AL556_194_12_19_combined.meryl $asm_hap1 $asm_hap2 AL556_194_12_hic_hifiasm.asm.hic_hap12_19
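Once a run like this finishes, one quick way to confirm whether any plots were written is to count image files that start with the output prefix. This is only a sketch: it assumes the plots are PNG files written to the current directory with names beginning with the prefix passed to merqury.sh, and `count_plots` is a hypothetical helper, not part of Merqury.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count PNG files whose names start with a given prefix.
# Assumption: Merqury writes its plots as <prefix>*.png in the working directory.
count_plots() {
    prefix=$1
    n=0
    for f in "$prefix"*.png; do
        # An unmatched glob stays literal, so test that the file really exists.
        [ -e "$f" ] && n=$((n + 1))
    done
    echo "$n"
}

# Example (commented out; the prefix is the one used in the thread):
# count_plots AL556_194_12_hic_hifiasm.asm.hic_hap12_19
```

A count of 0 after an otherwise complete run suggests that the plotting step itself failed, which is what the log files help diagnose.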

But in the end, no plots are generated. Based on issue #51, the following information might be relevant, though I'm not sure it is:

Found 1 command tree.
Number of 19-mers that are:
  unique      364045140     (exactly one instance of the kmer is in the input)
  distinct    416942646     (non-redundant kmer sequences in the input)
  present     672465051     (...)
  missing     274460964298  (non-redundant kmer sequences not in the input)

             number of   cumulative   cumulative     presence
              distinct     fraction     fraction   in dataset
frequency        kmers     distinct        total       (1e-6)


meryl statistics AL556_194_12_hic_hifiasm.asm.hic.hap2.p_ctg.meryl | head

Found 1 command tree.
Number of 19-mers that are:
  unique      330019187     (exactly one instance of the kmer is in the input)
  distinct    358808421     (non-redundant kmer sequences in the input)
  present     537132274     (...)
  missing     274519098523  (non-redundant kmer sequences not in the input)

             number of   cumulative   cumulative     presence
              distinct     fraction     fraction   in dataset
frequency        kmers     distinct        total       (1e-6)



- I'm using Merqury from a conda environment; both Merqury and meryl are version 1.3
- I'm attaching the log file AL556_194_12_hic_hifiasm.asm.hic_hap12_19.spectra-cn.log

[AL556_194_12_hic_hifiasm.asm.hic_hap12_19.spectra-cn.log](https://github.com/marbl/merqury/files/9490979/AL556_194_12_hic_hifiasm.asm.hic_hap12_19.spectra-cn.log)

Cheers,
kwi

arangrhie commented 1 year ago

Hello kwi,

It seems the required R packages aren't available in the conda version. :( The missing packages are argparse, ggplot2, and scales. Once the packages are installed, just executing the Rscript lines in the log should generate the plots.
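As a rough sketch of how the log could be used for both steps: the helpers below first list any R "missing package" errors, then pull out the recorded Rscript commands so they can be rerun once the packages are installed. Two assumptions apply: that the log contains R's standard "there is no package called" error text, and that each plotting command was echoed on its own line starting with `Rscript`. Both function names are hypothetical; check your own log before relying on them.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for inspecting a Merqury log file.

# Print each distinct R "missing package" error found in the log.
# Assumption: R reported the failure with its usual
# "there is no package called ..." message.
missing_r_packages_in_log() {
    grep -o 'there is no package called.*' "$1" | sort -u
}

# Print the Rscript command lines recorded in the log, leading whitespace removed.
# Assumption: each plotting command sits on its own line starting with "Rscript".
rscript_lines_in_log() {
    grep -E '^[[:space:]]*Rscript[[:space:]]' "$1" | sed -E 's/^[[:space:]]+//'
}

# Example (commented out; the log name is the one attached in the thread):
# log=AL556_194_12_hic_hifiasm.asm.hic_hap12_19.spectra-cn.log
# missing_r_packages_in_log "$log"
# rscript_lines_in_log "$log" | bash    # rerun the plotting commands
```

Reviewing the extracted commands before piping them to `bash` is worthwhile, since they run exactly as recorded, with the paths from the original run.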

I wonder if your cluster has an R module with no packages, which could cause this issue (relevant to #49). Try replacing line 15 of this file in your conda environment, /gxfs_home/geomar/smomw426/.conda/envs/mercury/share/merqury/util/util.sh, with:

echo 1
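For anyone who prefers to script this one-line change rather than edit the file by hand, a minimal sketch follows. It prints the line about to be replaced and keeps a `.bak` copy of the original file. `replace_line_with_echo1` and `UTIL_SH` are hypothetical names, and line 15 applies to the installation discussed here; other Merqury versions may differ, so inspect the printed line before trusting the result.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace one line of a file with "echo 1",
# keeping a backup copy as <file>.bak.
# Usage: replace_line_with_echo1 <file> <line_number>
replace_line_with_echo1() {
    file=$1
    line=$2
    # Show the line that is about to be replaced, for a quick sanity check.
    sed -n "${line}p" "$file"
    # Replace the whole line in place; the original file is saved as <file>.bak.
    sed -i.bak "${line}s/.*/echo 1/" "$file"
}

# Example (commented out; UTIL_SH is a placeholder for the util.sh path
# inside your own conda environment):
# replace_line_with_echo1 "$UTIL_SH" 15
```

Restoring the original is then a matter of copying the `.bak` file back over the edited one.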

Thanks, Arang

kwiyounghan commented 1 year ago

Hi Arang,

Thanks for your response! Changing the line to "echo 1" solved the problem!!

best, Kwi