Closed: boucherufl closed this issue 2 years ago
Hello Christina, could you double check if the path to meryl bin is in your $PATH? Seems like meryl is not found.
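A quick way to check this from the same shell that will launch Merqury is sketched below. The helper name `have_tool` is hypothetical, and the commented-out install path is a placeholder to be replaced with the real location; only the standard `command -v` builtin is relied on.

```shell
#!/usr/bin/env bash
# have_tool is a hypothetical helper: it succeeds if the named program
# can be resolved through $PATH in the current shell.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool meryl; then
    echo "meryl found at: $(command -v meryl)"
else
    echo "meryl is not on PATH" >&2
    # Placeholder path: adjust to wherever meryl was actually installed.
    # export PATH=/path/to/meryl/bin:$PATH
fi
```

Running the same check inside the job script or non-interactive shell that calls merqury.sh may be more telling than running it at the login prompt, since the two environments can differ.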
Hi Arang,
Thanks for your reply. I am still struggling to get it running properly. Rather than install it locally, I have asked the Hipergator support to install it as a module.
I triple-checked and meryl is installed correctly. When I type "meryl", the help comes up.
What is disconcerting is that when I write:
$MERQURY/merqury.sh manatee.meryl polished_sequences.fasta CB-test2
I get the following error in the log:
Can't interpret 'manatee.meryl': not a meryl command, option, or recognized input file.
Can't interpret 'polished_sequences.meryl': not a meryl command, option, or recognized input file.
Can you give me some insight?
Best,
Christina
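One possible cause of a "Can't interpret" message is that the named database is not visible from the working directory, so a quick sanity check on the inputs may help narrow things down. This is only a sketch: `check_inputs` is a hypothetical helper (not part of Merqury), and it assumes a meryl database is a directory on disk.

```shell
#!/usr/bin/env bash
# check_inputs is a hypothetical helper, not part of Merqury.
# It reports whether the read database directory and the assembly FASTA
# are visible from the current working directory.
check_inputs() {
    local db=$1 fasta=$2 status=0
    if [ ! -d "$db" ]; then
        echo "missing or not a directory: $db"
        status=1
    fi
    if [ ! -s "$fasta" ]; then
        echo "missing or empty: $fasta"
        status=1
    fi
    return "$status"
}

# File names taken from the command above.
check_inputs manatee.meryl polished_sequences.fasta \
    || echo "fix the paths before running merqury.sh"
```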
Christina Boucher
Associate Professor
Computer & Information Science & Engineering Department
Herbert Wertheim College of Engineering
University of Florida
Gainesville, FL 32611
http://www.christinaboucher.com/
Google Scholar: https://scholar.google.com/citations?user=wpPBcf4AAAAJ&hl=en&citsig=AMstHGQcx72PMDLXmo8GRH2-sYilrgTdjg
From: Arang Rhie @.> Sent: May 27, 2021 9:42 AM To: marbl/merqury @.> Cc: Boucher,Christina A @.>; Author @.> Subject: Re: [marbl/merqury] Error when running (#47)
Something is odd here. Could you double check which meryl version is installed? Make sure the v1.3 release is installed for both Merqury and Meryl.
Yes, both are installed. Both are versions 1.3, as they were installed a couple of weeks ago.
Hmm, ok, let's check whether manatee.meryl is correctly built.
What do you get when running:
meryl statistics manatee.meryl | head
If that is giving a reasonable summary, try running spectra-cn yourself:
$MERQURY/eval/spectra-cn.sh manatee.meryl polished_sequences.fasta CB-test2
and let me know what the log says.
Found 1 command tree.
Number of 21-mers that are:
  unique            1345124229  (exactly one instance of the kmer is in the input)
  distinct          2255727146  (non-redundant kmer sequences in the input)
  present           7479513837  (...)
  missing        4395790783958  (non-redundant kmer sequences not in the input)

             number of   cumulative   cumulative     presence
              distinct     fraction     fraction   in dataset
frequency        kmers     distinct        total       (1e-6)
Ok, it seems like a 'reasonable' meryl database, except that the number of k-mers seems quite small.
How did you obtain manatee.meryl? Was it built from Illumina reads? What was the sequencing depth? The summary says there are 1.3 G unique out of 2.2 G distinct k-mers, which seems too small for a 3~4 Gb genome. I wonder whether this was built from the assembly instead?
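As a rough worked check of those numbers, the sketch below derives two ratios from the statistics output posted above. `kmer_summary` is a hypothetical helper, and reading present/distinct as a proxy for k-mer depth is an assumption: repeats push it up and sequencing errors pull it down.

```shell
#!/usr/bin/env bash
# kmer_summary is a hypothetical helper for back-of-the-envelope ratios.
# Arguments: unique, distinct and present k-mer counts from `meryl statistics`.
kmer_summary() {
    awk -v unique="$1" -v distinct="$2" -v present="$3" 'BEGIN {
        # Fraction of distinct k-mers seen exactly once, and the mean
        # number of occurrences per distinct k-mer.
        printf "unique_fraction=%.3f mean_multiplicity=%.2f\n",
               unique / distinct, present / distinct
    }'
}

# Values copied from the statistics output in this thread.
kmer_summary 1345124229 2255727146 7479513837
```

With these inputs, about 60% of the distinct k-mers occur only once and each distinct k-mer is seen about 3.3 times on average, which fits a shallow read set rather than a deeply sequenced one.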
The manatee.meryl was obtained from Illumina data. I am uncertain about the sequencing depth, as I am coming into the project late; my task is to assess the assembly quality to decide on the next steps.
I'd like to just get it running so I can report back to Adam and the other collaborators, even if the assembly quality is low.
The Hipergator research staff installed it. Using their installation, the problem seems to persist.
Unfortunately, I don't think Merqury will give a reasonable analysis with the given .meryl, as the sequencing depth seems too low. It looks like 1~3x, with no peak. It might be better to go back and check whether more Illumina reads are available.
For a spectra-cn-like analysis, there should be a clear peak that separates erroneous k-mers from true copy-number k-mers.
meryl histogram manatee.meryl > manatee.hist
will give you the histogram, from which the above can be estimated.
QV estimation may also fail and underrepresent the truth if the reads do not cover the full assembly. What was the assembly size?
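To make the "clear peak" check concrete, here is a minimal sketch that looks for one in the histogram file. It assumes the histogram has two whitespace-separated columns (k-mer multiplicity, then number of distinct k-mers), and `hist_peak` is a hypothetical helper. Real histograms are noisy, so this is a first look, not a substitute for plotting.

```shell
#!/usr/bin/env bash
# hist_peak is a hypothetical helper. It skips the initial falling
# error tail, then reports the multiplicity with the highest count.
# If the counts never rise again, it prints "none".
hist_peak() {
    awk '
        NR == 1 { prev = $2; next }
        # The first row whose count exceeds the previous one marks the
        # end of the error tail.
        !valley && $2 > prev { valley = 1 }
        valley && $2 > best  { best = $2; peak = $1 }
        { prev = $2 }
        END { if (valley) print peak; else print "none" }
    ' "$1"
}

# File name taken from the command above; skipped if it does not exist.
if [ -f manatee.hist ]; then
    echo "peak multiplicity: $(hist_peak manatee.hist)"
fi
```

A result of "none" would be consistent with the low-depth reading above, since the counts would then fall monotonically from multiplicity 1.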
Do you have some test files that I can run to make sure this is an issue with the data and not the install / software?
Try this: https://github.com/marbl/merqury#example Let me know if the same error persists!
I ran Meryl without issues on all FASTQ files and merged the resulting files. I then ran Merqury as follows:
$MERQURY/merqury.sh ./manatee.meryl /blue/conesa/share/manatee_assembly/ont_polishing/polished_sequences.fasta CBtest
It runs without printing an error to stdout, but the output files are empty and there are errors in the log file. See below:
/home/christinaboucher/merqury-1.3/eval/spectra-cn.sh: line 88 Copy = 3 ..
Copy = 4 ..
Copy >4 ..
Copy numbers in k-mers found only in asm
No asm2_fa given. Done.
polished_sequences only
Write output
Get asm only for spectra-asm
Plot CBtest.spectra-asm.hist
Rscript /home/christinaboucher/merqury-1.3/plot/plot_spectra_cn.R -f CBtest.spectra-asm.hist -o CBtest.spectra-asm -z CBtest.dist_only.hist
[1] "x_max: "
Clean up
Done!
cannot remove ‘read.k.polished_sequences.3.meryl’: No such file or directory
/home/christinaboucher/merqury-1.3/eval/spectra-cn.sh: line 88: meryl: command not found
/home/christinaboucher/merqury-1.3/eval/spectra-cn.sh: line 89: meryl: command not found
rm: cannot remove ‘read.k.polished_sequences.4.meryl’: No such file or directory
/home/christinaboucher/merqury-1.3/eval/spectra-cn.sh: line 95: meryl: command not found
/home/christinaboucher/merqury-1.3/eval/spectra-cn.sh: line 96: meryl: command not found
rm: cannot remove ‘read.k.polished_sequences.gt4.meryl’: No such file or directory
/home/christinaboucher/merqury-1.3/eval/spectra-cn.sh: line 102: meryl: command not found
/home/christinaboucher/merqury-1.3/eval/spectra-cn.sh: line 103: meryl: command not found
/home/christinaboucher/merqury-1.3/eval/spectra-cn.sh: line 104: meryl: command not found
/home/christinaboucher/merqury-1.3/eval/spectra-cn.sh: line 105: -: syntax error: operand expected (error token is "-")
/home/christinaboucher/merqury-1.3/eval/spectra-cn.sh: line 117: meryl: command not found
/home/christinaboucher/merqury-1.3/eval/spectra-cn.sh: line 121: meryl: command not found
/home/christinaboucher/merqury-1.3/eval/spectra-cn.sh: line 122: meryl: command not found
/home/christinaboucher/merqury-1.3/eval/spectra-cn.sh: line 125: meryl: command not found
Loading required package: argparse
Loading required package: ggplot2
Loading required package: scales
Error in `[.data.frame`(dat_0, , 3) : undefined columns selected
Calls: spectra_cn_plot -> [ -> [.data.frame
In addition: Warning message:
In max(dat[dat[, 1] != "read-total" & dat[, 1] != "read-only" & :
  no non-missing arguments to max; returning -Inf
Execution halted
rm: cannot remove ‘polished_sequences.0.meryl’: No such file or directory
rm: cannot remove ‘read.k.polished_sequences.0.meryl’: No such file or directory
rm: cannot remove ‘read.k.polished_sequences.meryl’: No such file or directory
rm: cannot remove ‘manatee.gt0.meryl’: No such file or directory

Insight into correcting would be great.
Thanks. Christina
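The repeated "meryl: command not found" lines suggest that spectra-cn.sh could not resolve meryl in the shell where it ran, and the empty histograms, the R error and the failed rm calls would then all follow from that. A small sketch for pulling the missing program names out of a log is below. `missing_commands` is a hypothetical helper, and the default log name `merqury.log` is a placeholder, not a file Merqury is known to write.

```shell
#!/usr/bin/env bash
# missing_commands is a hypothetical helper: it lists each program that
# the shell reported as "command not found" in the given log, once each.
missing_commands() {
    grep -o '[^ :]*: command not found' "$1" \
        | sed 's/: command not found$//' \
        | sort -u
}

# Placeholder log name; pass the real log path as the first argument.
LOG=${1:-merqury.log}
if [ -f "$LOG" ]; then
    missing_commands "$LOG"
fi
```

If this prints meryl, one thing worth checking (an assumption about this setup, not something confirmed in the thread) is whether the module that provides meryl is loaded in the same session or job script that launches merqury.sh, not only in the interactive shell.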