marbl / merqury

k-mer based assembly evaluation

Failed to Open Assembly File in Merqury #118

Closed. XingzhengLee closed this issue 5 months ago.

XingzhengLee commented 5 months ago

Hello,

Thanks for your excellent tool! I am using Merqury to assess a trio-assembled genome, but it does not seem to handle the assembly file name correctly. My genome files end with ".fa", yet Merqury appends an extra ".fasta" suffix and then fails because it cannot find that file or build its index. I suspect the subsequent R error is also caused by the missing index.

Command:

merqury.sh child.meryl maternal.hapmer.meryl paternal.hapmer.meryl trio.asm.dip.hap1.p_ctg.fa trio.asm.dip.hap2.p_ctg.fa trio

Log:

read: child.meryl

Haplotype dbs provided.
Running Merqury in trio mode...

hap1: maternal.hapmer.meryl
hap2: paternal.hapmer.meryl
asm1: trio.asm.dip.hap1.p_ctg.fa
asm2: trio.asm.dip.hap2.p_ctg.fa
out : trio

Get spectra-cn plots and QV stats

Get blob plots

Get haplotype specfic spectra-cn plots

Get phase blocks

Get block N plots
No modules available..
# Generate trio.asm.dip.hap1.p_ctg.fa.fasta.fai
[E::fai_build3_core] Failed to open the file trio.asm.dip.hap1.p_ctg.fa.fasta
[faidx] Could not build fai index trio.asm.dip.hap1.p_ctg.fa.fasta.fai

*** # Found trio.asm.dip.hap1.p_ctg.fa.gaps.bed ***

# No gaps found. This is a contig set.
awk: fatal: cannot open file `trio.asm.dip.hap1.p_ctg.fa.fasta.fai' for reading (No such file or directory)

# Generate trio.asm.dip.hap2.p_ctg.fa.fasta.fai
[E::fai_build3_core] Failed to open the file trio.asm.dip.hap2.p_ctg.fa.fasta
[faidx] Could not build fai index trio.asm.dip.hap2.p_ctg.fa.fasta.fai

*** # Found trio.asm.dip.hap2.p_ctg.fa.gaps.bed ***

# No gaps found. This is a contig set.
awk: fatal: cannot open file `trio.asm.dip.hap2.p_ctg.fa.fasta.fai' for reading (No such file or directory)

# Convert trio.trio.asm.dip.hap1.p_ctg.fa.100_20000.phased_block.bed to sizes
 Result saved as trio.trio.asm.dip.hap1.p_ctg.fa.100_20000.phased_block.sizes

# Plot trio.trio.asm.dip.hap1.p_ctg.fa.100_20000.phased_block.bed
Rscript /home/lixingzheng/miniconda3/envs/merqury/share/merqury/plot/plot_block_N.R -b trio.trio.asm.dip.hap1.p_ctg.fa.100_20000.phased_block.sizes -c trio.trio.asm.dip.hap1.p_ctg.fa.contig.sizes  -o trio.trio.asm.dip.hap1.p_ctg.fa 
Loading required package: argparse
Loading required package: ggplot2
Loading required package: scales
Error in read.table(dat, header = F) : no lines available in input
Calls: block_n -> attach_n -> read.table
Execution halted

# Convert trio.trio.asm.dip.hap2.p_ctg.fa.100_20000.phased_block.bed to sizes
 Result saved as trio.trio.asm.dip.hap2.p_ctg.fa.100_20000.phased_block.sizes

# Plot trio.trio.asm.dip.hap2.p_ctg.fa.100_20000.phased_block.bed
Rscript /home/lixingzheng/miniconda3/envs/merqury/share/merqury/plot/plot_block_N.R -b trio.trio.asm.dip.hap2.p_ctg.fa.100_20000.phased_block.sizes -c trio.trio.asm.dip.hap2.p_ctg.fa.contig.sizes  -o trio.trio.asm.dip.hap2.p_ctg.fa 
Loading required package: argparse
Loading required package: ggplot2
Loading required package: scales
Error in read.table(dat, header = F) : no lines available in input
Calls: block_n -> attach_n -> read.table
Execution halted

** merqury done **

(Apologies for the Chinese-locale messages in the original log; the admin has not switched the system to English, but this should not affect understanding.)

Do I need to rename the assembly files from "*.fa" to "*.fasta" to resolve this issue?

I appreciate your time and assistance!
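The doubled suffix in the log (`trio.asm.dip.hap1.p_ctg.fa.fasta`) is what one would expect if a script strips only a trailing `.fasta` from the input name and then appends `.fasta` again. The sketch below reproduces that pattern. It is an assumption made for illustration, not code taken from Merqury, and the function name `derive_index_target` is hypothetical.

```shell
#!/bin/sh
# Hypothetical reconstruction (NOT Merqury's actual code):
# strip a trailing ".fasta" from the input name, then re-append ".fasta".
derive_index_target() {
  asm_fa=$1
  base=${asm_fa%.fasta}        # removes ".fasta" only; a ".fa" suffix is left in place
  printf '%s.fasta\n' "$base"
}

# A ".fa" input keeps its suffix and gains a second one:
derive_index_target trio.asm.dip.hap1.p_ctg.fa

# A ".fasta" input round-trips to the same name:
derive_index_target trio.asm.dip.hap1.p_ctg.fasta
```

Under this assumption, only inputs that already end in `.fasta` map back to a file that exists, which matches the `faidx` failure reported in the log.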

arangrhie commented 5 months ago

Hello @XingzhengLee, the latest commit in this repository should fix this. Can you clone the repository and rerun Merqury with the latest code?

XingzhengLee commented 5 months ago

Thanks for your advice. I renamed the genome files from ".fa" to ".fasta", and that also worked. Thank you!
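For readers who would rather not rename their assemblies, a symbolic link with a `.fasta` suffix is a possible variation on the workaround above. This is a minimal sketch: the helper name `link_as_fasta` is hypothetical, and the thread only confirms that renaming worked, so whether Merqury behaves identically with symlinks is an untested assumption.

```shell
#!/bin/sh
# Create a "<name>.fasta" symlink next to each "<name>.fa" file and print the
# new name. Existing files are never overwritten.
link_as_fasta() {
  for fa in "$@"; do
    case $fa in
      *.fa) ;;                                   # only handle ".fa" inputs
      *) echo "skipping $fa: no .fa suffix" >&2; continue ;;
    esac
    fasta=${fa%.fa}.fasta
    # Link target is relative, so the link stays valid if the directory moves.
    [ -e "$fasta" ] || ln -s "$(basename "$fa")" "$fasta"
    printf '%s\n' "$fasta"
  done
}

# Example usage (commented out; file names are taken from the command above):
# link_as_fasta trio.asm.dip.hap1.p_ctg.fa trio.asm.dip.hap2.p_ctg.fa
# merqury.sh child.meryl maternal.hapmer.meryl paternal.hapmer.meryl \
#   trio.asm.dip.hap1.p_ctg.fasta trio.asm.dip.hap2.p_ctg.fasta trio
```

The original `.fa` files stay untouched, so other tools that expect them keep working. Updating to the latest code, as suggested above, removes the need for either workaround.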