Open ghost opened 4 years ago
@alekseyzimin any idea? We are still trying to figure it out. We keep getting the same errors.
Hello,
The default for -t is likely 1, which is not a good setting; you should use the number of cores (16 or 32?). Your first run failed because the -i and -m options need numeric values.
/opt/MaSuRCA-3.3.4/bin/chromosome_scaffolder.sh -r ./GCF_000005245.1_dvir_caf1_genomic.fna -q ./assembly.fasta.fixed -t 32 -i 95 -m 100000
should work. --Aleksey
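If the core count of the machine is not known in advance, it can be looked up rather than hard-coded. This is only a sketch: the THREADS variable name is an assumption for illustration, and getconf _NPROCESSORS_ONLN is widely available on Linux and macOS but is not guaranteed on every system, hence the fallback to 1.

```shell
# Ask the system for the number of online CPU cores; fall back to 1 if unavailable.
THREADS=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
echo "$THREADS"

# Example usage (command from this thread, with -t taken from the lookup):
# /opt/MaSuRCA-3.3.4/bin/chromosome_scaffolder.sh -r ./GCF_000005245.1_dvir_caf1_genomic.fna \
#     -q ./assembly.fasta.fixed -t "$THREADS" -i 95 -m 100000
```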
Dear @alekseyzimin,
thank you for your response. I ran the scaffolder with the following settings:
/opt/MaSuRCA-3.3.4/bin/chromosome_scaffolder.sh -r ./GCF_000005245.1_dvir_caf1_genomic.fna -q ./assembly.fasta.fixed -t 16 -i 95 -m 100000 -v -s ./SRR7167958_1.fastq.gz -cl 3 -ch 64
and the output was:
+ shift
+ [[ 6 > 0 ]]
+ key=-s
+ case $key in
+ READS=SRR7167958_1.fastq.gz
+ shift
+ shift
+ [[ 4 > 0 ]]
+ key=-cl
+ case $key in
+ COV_THRESH=3
+ shift
+ shift
+ [[ 2 > 0 ]]
+ key=-ch
+ case $key in
+ REP_COV_THRESH=64
+ shift
+ shift
+ [[ 0 > 0 ]]
++ basename GCF_000005245.1_dvir_caf1_genomic.fna
+ REF_CHR=GCF_000005245.1_dvir_caf1_genomic.fna
++ basename assembly.fasta.fixed
+ HYB_CTG=assembly.fasta.fixed.split
+ HYB_POS=assembly.fasta.fixed.split.posmap
+ rm -rf .rerun
+ '[' '!' -s assembly.fasta.fixed.split ']'
+ log 'Splitting query scaffolds into contigs'
++ date
+ dddd='Thu Nov 14 16:19:33 EET 2019'
+ echo -e '\e[0;32m[Thu Nov 14 16:19:33 EET 2019]\e[0m Splitting query scaffolds into contigs'
[Thu Nov 14 16:19:33 EET 2019] Splitting query scaffolds into contigs
+ /opt/MaSuRCA-3.3.4/bin/splitFileAtNs assembly.fasta.fixed 1
+ touch .rerun
+ '[' '!' -s assembly.fasta.fixed.split.posmap ']'
+ log 'Mapping reads to query contigs'
++ date
+ dddd='Thu Nov 14 16:19:34 EET 2019'
+ echo -e '\e[0;32m[Thu Nov 14 16:19:34 EET 2019]\e[0m Mapping reads to query contigs'
[Thu Nov 14 16:19:34 EET 2019] Mapping reads to query contigs
+ /opt/MaSuRCA-3.3.4/bin/../CA8/Linux-amd64/bin/blasr -nproc 16 -bestn 1 SRR7167958_1.fastq.gz assembly.fasta.fixed.split
+ awk '{if(($11-$10)/$12>0.75){if($4==0) print $1" "substr($2,4)" "$7" "$8" f"; else print $1" "substr($2,4)" "$9-$8" "$9-$7" r"}}'
+ sort -nk2 -k3n -S 10%
[INFO] 2019-11-14T16:19:34 [blasr] started.
awk: cmd. line:1: (FILENAME=- FNR=1) fatal: division by zero attempted
Should I change anything else in my options?
Ah, here is the problem-- your reads are gzipped and thus blasr fails. Unzip them and it will work.
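A minimal sketch of the decompression step, assuming standard gzip is installed. The helper name decompress_reads is hypothetical; it keeps the original .gz file and writes the plain FASTQ next to it.

```shell
# Hypothetical helper: decompress a gzipped reads file and print the new path.
decompress_reads() {
    gz="$1"
    out="${gz%.gz}"   # strip the trailing .gz, e.g. reads.fastq.gz -> reads.fastq
    # gzip -dc writes the decompressed data to stdout and leaves the .gz file intact
    gzip -dc "$gz" > "$out" && echo "$out"
}

# Example usage (file names and options from this thread):
# READS=$(decompress_reads ./SRR7167958_1.fastq.gz)
# /opt/MaSuRCA-3.3.4/bin/chromosome_scaffolder.sh -r ./GCF_000005245.1_dvir_caf1_genomic.fna \
#     -q ./assembly.fasta.fixed -t 16 -i 95 -m 100000 -v -s "$READS" -cl 3 -ch 64
```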
-- Dr. Alexey V. Zimin Associate Research Scientist Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA (301)-437-6260 website http://ccb.jhu.edu/people/alekseyz/ blog http://masurca.blogspot.com
I will add a check for the reads file extension in the next release. Unfortunately, it is not easy to accept gzipped files automatically: I cannot pipe reads into blasr, because it actually looks at the file extension.
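Until such a check exists in the script, something similar can be done by hand before launching a long run. The function below is a hypothetical sketch, not the code that went into any MaSuRCA release; the accepted extensions (.fa, .fasta, .fastq) are taken from the error message quoted later in this thread, and the list of compressed extensions is an assumption.

```shell
# Hypothetical pre-flight check: reject compressed or unknown reads files
# before blasr is started.
check_reads_extension() {
    case "$1" in
        *.gz|*.bz2|*.xz)
            echo "Reads file $1 looks compressed; decompress it first" >&2
            return 1 ;;
        *.fa|*.fasta|*.fastq)
            return 0 ;;
        *)
            echo "Unrecognized reads extension: $1" >&2
            return 1 ;;
    esac
}

# Example usage:
# check_reads_extension ./SRR7167958_1.fastq.gz || exit 1
```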
Dear @alekseyzimin, I unzipped the file as you suggested and it worked. The scaffolding stopped after the following steps:
[Fri Nov 15 21:20:54 EET 2019] Splitting query scaffolds into contigs
+ /opt/MaSuRCA-3.3.4/bin/splitFileAtNs assembly.fasta.fixed 1
+ touch .rerun
+ '[' '!' -s assembly.fasta.fixed.split.posmap ']'
+ log 'Mapping reads to query contigs'
++ date
+ dddd='Fri Nov 15 21:20:57 EET 2019'
+ echo -e '\e[0;32m[Fri Nov 15 21:20:57 EET 2019]\e[0m Mapping reads to query contigs'
[Fri Nov 15 21:20:57 EET 2019] Mapping reads to query contigs
+ /opt/MaSuRCA-3.3.4/bin/../CA8/Linux-amd64/bin/blasr -nproc 16 -bestn 1 SRR7167958_1.fastq assembly.fasta.fixed.split
+ awk '{if(($11-$10)/$12>0.75){if($4==0) print $1" "substr($2,4)" "$7" "$8" f"; else print $1" "substr($2,4)" "$9-$8" "$9-$7" r"}}'
+ sort -nk2 -k3n -S 10%
[INFO] 2019-11-15T21:20:57 [blasr] started.
[INFO] 2019-11-16T02:27:21 [blasr] ended.
+ touch .rerun
+ '[' '!' -s GCA_007989325.1_vir160_genomic.fna.assembly.fasta.fixed.split.delta ']'
+ log 'Aligning query contigs to reference scaffolds'
++ date
+ dddd='Sat Nov 16 02:27:26 EET 2019'
+ echo -e '\e[0;32m[Sat Nov 16 02:27:26 EET 2019]\e[0m Aligning query contigs to reference scaffolds'
[Sat Nov 16 02:27:26 EET 2019] Aligning query contigs to reference scaffolds
+ /opt/MaSuRCA-3.3.4/bin/nucmer -t 16 -p GCA_007989325.1_vir160_genomic.fna.assembly.fasta.fixed.split -c 200 GCA_007989325.1_vir160_genomic.fna assembly.fasta.fixed.split
+ touch .rerun
+ '[' '!' -s GCA_007989325.1_vir160_genomic.fna.assembly.fasta.fixed.split.1.delta ']'
+ log 'Filtering the alignments'
++ date
+ dddd='Sat Nov 16 02:30:02 EET 2019'
+ echo -e '\e[0;32m[Sat Nov 16 02:30:02 EET 2019]\e[0m Filtering the alignments'
[Sat Nov 16 02:30:02 EET 2019] Filtering the alignments
+ /opt/MaSuRCA-3.3.4/bin/delta-filter -1 -i 95 -o 20 GCA_007989325.1_vir160_genomic.fna.assembly.fasta.fixed.split.delta
+ touch .rerun
+ '[' '!' -s assembly.fasta.fixed.split.posmap.coverage ']'
+ log 'Computing read coverage for query contigs'
++ date
+ dddd='Sat Nov 16 02:30:03 EET 2019'
+ echo -e '\e[0;32m[Sat Nov 16 02:30:03 EET 2019]\e[0m Computing read coverage for query contigs'
[Sat Nov 16 02:30:03 EET 2019] Computing read coverage for query contigs
+ awk '{print $1" "$2" "$3"\n"$1" "$2" "$4}' assembly.fasta.fixed.split.posmap
+ grep -v F
+ grep -v R
+ sort -nk2 -k3n -S 10%
+ /opt/MaSuRCA-3.3.4/bin/compute_coverage.pl
The output files are:
0 Nov 16 02:30 assembly.fasta.fixed.split.posmap.coverage
4.6M Nov 16 02:30 GCA_007989325.1_vir160_genomic.fna.assembly.fasta.fixed.split.1.delta
0 Nov 16 02:30 .rerun
6.5M Nov 16 02:30 GCA_007989325.1_vir160_genomic.fna.assembly.fasta.fixed.split.delta
42M Nov 16 02:27 assembly.fasta.fixed.split.posmap
156M Nov 15 21:20 assembly.fasta.fixed.split
22K Nov 15 21:20 genome.asm
28K Nov 15 21:20 genome.posmap.ctgscf
20K Nov 15 21:20 scaffNameTranslations.txt
A failure here would be unusual. Can you post a few lines (head -n 5) from the assembly.fasta.fixed.split.posmap file?
These are the first 10 lines from the assembly.fasta.fixed.split.posmap file:
SRR7167958.1059890 7180000000000 449 6604 r
SRR7167958.460610 7180000000000 450 2527 f
SRR7167958.354243 7180000000000 916 5839 r
SRR7167958.950193 7180000000000 1042 5778 r
SRR7167958.1208632 7180000000000 1257 6258 f
SRR7167958.412011 7180000000000 1510 5774 r
SRR7167958.235508 7180000000000 1551 7287 r
SRR7167958.363957 7180000000000 1833 6025 r
SRR7167958.357056 7180000000000 2061 5779 f
SRR7167958.1092340 7180000000000 2364 13203 r
This is exactly what I was thinking. There is a bug/feature that is aimed at filtering out short reads, but if you only have short reads, the scaffolding will fail.
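The effect of the filter on this particular data set can be reproduced with two lines from the posmap excerpt above and the awk and grep stages shown in the trace. This is only an illustration of how grep -v behaves on these read names, not a description of the script's full logic: every read name here starts with SRR, so grep -v R removes every line, which is consistent with the 0-byte .coverage file in the listing.

```shell
# Two lines copied from the posmap excerpt in this thread.
posmap_sample='SRR7167958.1059890 7180000000000 449 6604 r
SRR7167958.460610 7180000000000 450 2527 f'

# Same expansion and filter as in the trace: each posmap line becomes two lines,
# then grep -v drops every line containing an uppercase F or R.
# "|| true" is needed because grep exits non-zero when it selects no lines.
filtered=$(printf '%s\n' "$posmap_sample" \
    | awk '{print $1" "$2" "$3"\n"$1" "$2" "$4}' \
    | grep -v F | grep -v R || true)

# Count the non-empty lines that survived the filter.
kept=$(printf '%s\n' "$filtered" | awk 'NF{n++} END{print n+0}')
echo "lines kept: $kept"   # prints "lines kept: 0"
```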
I just pushed version 3.3.5beta where the bug is fixed; the fix amounts to simply removing "grep -v F | grep -v R" from one line in chromosome_scaffolder.sh. You can get this version here:
https://github.com/alekseyzimin/masurca/blob/master/MaSuRCA-3.3.5b.tar.gz
Note that the new version of chromosome scaffolder is not compatible with the directory structure of the previous version -- you need to run it in the new folder.
To quickly fix your failure, you can just remove grep -v F |grep -v R | from line 125 in MaSuRCA-3.3.4/bin/chromosome_scaffolder.sh, delete the assembly.fasta.fixed.split.posmap.coverage file, and re-run.
--Aleksey
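The quick fix can also be applied with sed instead of a text editor. This is a sketch under assumptions: the helper name remove_fr_filter is hypothetical, and the exact spacing around the pipes on line 125 is not known from this thread, so the pattern tolerates any number of spaces. A backup of the script is kept with a .bak suffix.

```shell
# Hypothetical helper: strip the "grep -v F | grep -v R |" stage from the script.
remove_fr_filter() {
    script="$1"
    # -i.bak edits in place and keeps the original as $script.bak;
    # " *" matches zero or more spaces, and "|" is a literal pipe in basic regex
    sed -i.bak 's/grep -v F *| *grep -v R *| *//' "$script"
}

# Example usage (paths from this thread):
# remove_fr_filter /opt/MaSuRCA-3.3.4/bin/chromosome_scaffolder.sh
# rm -f assembly.fasta.fixed.split.posmap.coverage   # delete the empty file before re-running
```

Comparing the script against its .bak copy with diff afterwards is a cheap way to confirm that only the intended line changed.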
Dear @alekseyzimin, I did as you suggested, I removed grep -v F |grep -v R | and the scaffolding finished successfully.
Thank you very much for your help!
You are welcome! Thank you for identifying this problem!
Dear Professor @alekseyzimin, I am using the masurca-polishing tool (chromosome scaffolder) with the command below, and the following error was displayed:
chromosome_scaffolder.sh -r medref.fna -q mac3.fasta -t 36
[Friday 12 August 2022 11:32:30 AM IST] Computing gap coordinates in the reference
[Friday 12 August 2022 11:33:22 AM IST] Splitting query scaffolds at >100bp gaps
[Friday 12 August 2022 11:33:40 AM IST] Adding noise to reference to align to duplicated regions
[Friday 12 August 2022 11:43:18 AM IST] Mapping reads to query contigs
[Friday 12 August 2022 11:43:18 AM IST] Wrong type/extension for the file, must be .fa, .fasta or .fastq
But both my query and my reference were in FASTA format. Please suggest a solution.
same issue! how to solve?
Maybe this is because the extension of the medref.fna file is .fna and not .fasta or .fa? Try changing it to .fa; that may solve it.
I tried, but it didn't work. I also tried using the files in their own folder (./ref.fa), but it still fails with the same error.
Dear Professor @alekseyzimin, I am using chromosome_scaffolder.sh from MaSuRCA-4.0.9 with the command:
chromosome_scaffolder.sh -r ../T2T_CHM13.V2.0_GCA_009914755.4_20220403/chm13v2.0.fa -q ../TGS_ZHU/asm/asm_fa/${i}.p_ctg.fa.gz -t 90 -nb -v
The error is "cat: chm13v2.0.fa.22TF01547.asm.bp.hap1.p_ctg.fa.gz.split.reconciled.txt: No such file or directory". In fact, it has already produced the temporary file "chm13v2.0.fa.22TF01547.asm.bp.hap1.p_ctg.fa.gz.split.reconciled.txt.tmp". Could you help me find the problem?
Dear Professor Zimin, I am using the two new features of MaSuRCA-3.3.4 and I have encountered a couple of issues. First of all, while using the masurca-polishing tool I was using full paths to execute the command and the following error was displayed:
I managed to work past this problem by creating soft links in the directory I was working in.
Second, I tried to use the chromosome scaffolder tool by running the following command:
The output was:
Afterwards, I ran the tool without the -m option and the output was just
Unknown option 3
Finally, after using the following command: The output was:
What should I change in order to run the tool successfully? Which are the default values of -t, -m and -v? Thank you in advance!
Kind regards, Marios