alekseyzimin / masurca

GNU General Public License v3.0

polca.sh doesn't work if -r has two sequence files #146

Open flystar233 opened 4 years ago

flystar233 commented 4 years ago

ONE:
polca.sh -a tangyu.racon.v2.fasta -r 'wHADPI077755-37_1.fq.gz'
[Wed Dec 18 23:43:13 CST 2019] Creating BWA index for tangyu.racon.v2.fasta

TWO, as the README said:
polca.sh -a tangyu.racon.v2.fasta -r 'wHADPI077755-37_1.fq.gz wHADPI077755-37_2.fq.gz'
Input files not found or not specified!

flystar233 commented 4 years ago

@alekseyzimin

alekseyzimin commented 4 years ago

Hello,

To specify two Illumina sequences please use -r 'file1.fastq file2.fastq'

--Aleksey


-- Dr. Alexey V. Zimin Associate Research Scientist Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA (301)-437-6260 website http://ccb.jhu.edu/people/alekseyz/ blog http://masurca.blogspot.com
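
For readers unsure how the quoted -r value is interpreted: both file names travel as one shell argument and are split on whitespace inside the script. The following is a minimal sketch of that pattern, not the actual polca.sh code. The function name check_read_files is hypothetical, and file names containing spaces or glob characters are not handled.

```shell
#!/bin/sh
# Hypothetical helper (not taken from polca.sh): checks that every file named
# in a single space-separated argument exists and is non-empty.
check_read_files() {
    reads="$1"
    if [ -z "$reads" ]; then
        echo "No read files specified" >&2
        return 1
    fi
    # Intentionally unquoted: word splitting breaks the list on whitespace.
    for f in $reads; do
        if [ ! -s "$f" ]; then
            echo "Read file $f not found or empty" >&2
            return 1
        fi
    done
    return 0
}
```

It could be called as check_read_files 'file1.fastq file2.fastq' before starting a long run, so that a missing file is reported by name.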

flystar233 commented 4 years ago

Thank you for your reply, but this still does not solve the problem.

cat polish_POLCA.sh
polca.sh -a tangyu.racon.v2.fasta -r 'wHADPI077755-37_1.fq wHADPI077755-37_2.fq' -t 16 -m 5G

sh polish_POLCA.sh
Input files not found or not specified!
Usage: polca.sh -a -r <'Illumina_reads_fastq1 Illumina_reads_fastq'> -t [-n] [-m] <optional: memory per thread to use in samtools sort>


alekseyzimin commented 4 years ago

Thank you for reporting this bug. I fixed it and updated the release. You can simply download the new release and copy polca.sh from global-1/PacBio/src_reconcile/polca.sh to the MaSuRCA bin folder, then run chmod 0755 bin/polca.sh.

Or you can delete and reinstall MaSuRCA.
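
The manual update described above might look like the following sketch. The function name install_polca and both directory arguments are placeholders for illustration; only the relative path global-1/PacBio/src_reconcile/polca.sh and the chmod 0755 step come from the comment above.

```shell
#!/bin/sh
# Hypothetical helper: copy polca.sh from an unpacked release tree into an
# existing MaSuRCA installation and make it executable.
#   $1 = unpacked release directory (assumed to contain global-1/PacBio/src_reconcile/)
#   $2 = MaSuRCA install directory containing bin/
install_polca() {
    src="$1/global-1/PacBio/src_reconcile/polca.sh"
    dest="$2/bin/polca.sh"
    [ -f "$src" ] || { echo "$src not found" >&2; return 1; }
    [ -d "$2/bin" ] || { echo "$2/bin not found" >&2; return 1; }
    cp "$src" "$dest" && chmod 0755 "$dest"
}
```

A call would take the form install_polca /path/to/new/release /path/to/installed/MaSuRCA, with both paths adjusted to the local setup.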


flystar233 commented 4 years ago

It works normally, thanks.

alekseyzimin commented 4 years ago

Thank you for confirming. Did it run faster than racon/pilon?


PerisD commented 4 years ago

Hi @alekseyzimin, I have some comments about polca.sh.

  1. It looks like the reads must be gzipped, since zcat is called, so I think the documentation should state that the input reads must be compressed.
  2. polca.sh line 133 is missing the -o flag before $BASM.alignSorted
  3. polca.sh line 133 is missing the .bam suffix ($BASM.alignSorted), although the file is referenced with it in line 114 ($BASM.alignSorted.bam)
  4. [Recommendation]: I would replace "masurca" in the output name with "CorrectedAssembly", since others may use POLCA to correct non-MaSuRCA assemblies.

The rest looks fine, and the program ran fast (2 minutes for a yeast assembly with a 12 Mb genome).

Cheers, Peris

alekseyzimin commented 4 years ago

Thank you for your suggestions.

  1. I use zcat -f, so the input reads do not have to be gzipped.
  2 & 3. samtools sort requires a prefix as input and adds the .bam extension on its own, hence no extension. Alternatively, I could use "-o $BASM.alignSorted.bam".
  4. I will change the corrected output name to .PolcaCorrected.fa in a future release.
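
To illustrate the first point: with GNU gzip, zcat -f decompresses gzipped input and copies uncompressed input through unchanged, so one pipeline handles both kinds of file. The sketch below assumes GNU gzip. The function name count_fastq_records is made up for this example and is not part of polca.sh.

```shell
#!/bin/sh
# Count FASTQ records (4 lines each) in a plain or gzipped file.
# Assumes GNU gzip, whose zcat -f copies non-gzip input through unchanged.
count_fastq_records() {
    lines=$(zcat -f "$1" | wc -l)
    echo $((lines / 4))
}
```

Called on reads.fq and on reads.fq.gz made from the same file, the function should report the same record count.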


jpummil commented 4 years ago

> it works normally, Thanks

Aleksey, the fix doesn't seem to resolve my issue. Same issue as flystar233...

[pummill@login006 CABOG]$ ~/MaSuRCA-3.3.5/bin/polca.sh -a Cabog-Assembly-012020.fasta -r 'NU2WGS_R1.fastq NU2WGS_R2.fastq' -t 32 -m 32
Input files not found or not specified!
Usage: polca.sh -a -r <'Illumina_reads_fastq1 Illumina_reads_fastq'> -t [-n] [-m] <optional: memory per thread to use in samtools sort>

[pummill@login006 CABOG]$ ls -lh NU*
-rw-r--r-- 1 pummill ca8hebp 13G Jan 21 14:31 NU2WGS_R1.fastq
-rw-r--r-- 1 pummill ca8hebp 13G Jan 21 14:31 NU2WGS_R2.fastq

[pummill@login006 CABOG]$ ~/MaSuRCA-3.3.5/bin/masurca -version
version 3.3.5

As I only downloaded and built this version of MaSuRCA last week, I assumed that polca.sh had probably been updated, but it doesn't seem so. I then copied polca.sh from global-1/PacBio/src_reconcile/polca.sh to the MaSuRCA bin folder and ran chmod 0755 bin/polca.sh on it. No change...

alekseyzimin commented 4 years ago

Hello,

I just downloaded MaSuRCA 3.3.5 from the release page, https://github.com/alekseyzimin/masurca/releases/download/v3.3.5/MaSuRCA-3.3.5.tar.gz, and verified that polca.sh has been updated in the release. The error message "Input files not found or not specified!" is no longer in the script. It has been replaced by "Input file $ASM not found or not specified!"; see commit https://github.com/alekseyzimin/PacBio/commit/283df7847bed279fcdeb81973bbd248b5eabe645

Please check your polca.sh and verify that you have the updated version.

Best, Aleksey
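
One way to run the suggested check is to search the installed script for the old message. This is only a sketch: the function name polca_is_outdated is hypothetical, and the only string taken from the comment above is the pre-fix error message.

```shell
#!/bin/sh
# Returns 0 (true) if the given polca.sh still contains the pre-fix error
# message quoted in this thread, 1 if it does not, 2 if the file is missing.
polca_is_outdated() {
    [ -f "$1" ] || { echo "$1 not found" >&2; return 2; }
    if grep -qF 'Input files not found or not specified!' "$1"; then
        return 0
    fi
    return 1
}
```

For example, polca_is_outdated ~/MaSuRCA-3.3.5/bin/polca.sh && echo "old polca.sh" would flag a stale copy. Running command -v polca.sh also shows which copy the shell picks up when the script is called without a full path.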


jpummil commented 4 years ago

Thanks so much Aleksey!

Not sure what I did wrong the first time, but this time it did indeed work! Polca is up and running! I really appreciate your quick response to get me sorted again.

--Jeff