bcgsc / tigmint

⛓ Correct misassemblies using linked AND long reads
https://bcgsc.github.io/tigmint/
GNU General Public License v3.0

gsed: command not found #21

Closed mossishahi closed 6 years ago

mossishahi commented 6 years ago

I hit an error while following the tigmint-arcs pipeline: the gsed command was not found. I tried to install it with brew install gnu-sed, but that failed with the error "gnu-sed cannot be built with any available compilers." Does it make a difference if I run sed instead of gsed?

lcoombe commented 6 years ago

Hi @mossishahi - Are you using the latest release? I can see this sed command in the latest release (not gsed):

# Rename the scaffolds.
%.links.fa: %.links.scaffolds.fa
    sed -r 's/^>scaffold([^,]*),(.*)/>\1 scaffold\1,\2/' $< >$@

Otherwise, could you specify which command is causing the failure? Thanks!
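On the sed versus gsed question: gsed is simply the name GNU sed is usually installed under on macOS, and on most Linux systems plain sed is already GNU sed. A minimal sketch of the rename step that works with either, assuming a made-up LINKS-style header (the `header` value and the `SED` variable are illustrative, not part of Tigmint):

```shell
# Use gsed when it exists (typical for Homebrew on macOS), otherwise plain sed.
if command -v gsed >/dev/null 2>&1; then
    SED=gsed
else
    SED=sed
fi

# Hypothetical scaffold header, used only to illustrate the substitution.
header='>scaffold1,12345,f1Z12345'

# -E selects extended regular expressions in both GNU sed and BSD sed;
# the pipeline's -r is the GNU spelling of the same option.
renamed=$(printf '%s\n' "$header" | "$SED" -E 's/^>scaffold([^,]*),(.*)/>\1 scaffold\1,\2/')
printf '%s\n' "$renamed"
```

The substitution moves the scaffold number to the front of the header and keeps the original name as a description.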

mossishahi commented 6 years ago

@lcoombe I use Tigmint 1.1.0, and I am trying to run the command gsed -r 's/^>scaffold([^,]*),(.*)/>\1 scaffold\1,\2/' DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.barcoded.c5_e30000_r0.05.arcs.a0.1_l10.links.scaffolds.fa >DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.barcoded.c5_e30000_r0.05.arcs.a0.1_l10.links.fa, which is part of the pipeline suggested by tigmint-make.

lcoombe commented 6 years ago

@mossishahi - I suggest that you use the latest release 1.1.2. This version corrects a significant bug found in 1.1.0.

mossishahi commented 6 years ago

@lcoombe We have already run the time-consuming Tigmint-ARCS pipeline like this:

* bwa index DjScaff_fnl20141213.renamed.fa
* bwa mem -t8 -pC DjScaff_fnl20141213.renamed.fa barcoded.fq.gz | samtools view -u -F4 | samtools sort -@8 -tBX -T$(mktemp -u -t DjScaff_fnl20141213.renamed.barcoded.sortbx.bam.XXXXXX) -o DjScaff_fnl20141213.renamed.barcoded.sortbx.bam
* /s/chopin/a/grad/asharifi/e/bin/tigmint-molecule -a0.65 -n5 -q0 -d50000 -o DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.bed DjScaff_fnl20141213.renamed.barcoded.sortbx.bam
* awk '$3 - $2 >= 2000' DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.bed >DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.bed
* samtools faidx DjScaff_fnl20141213.renamed.fa
* /s/chopin/a/grad/asharifi/e/bin/tigmint-cut -p8 -w1000 -n20 -t0 -o DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.fa DjScaff_fnl20141213.renamed.fa DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.bed
* bwa index DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.fa
* bwa mem -t8 -pC DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.fa barcoded.fq.gz | samtools view -@8 -h -F4 -o DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.barcoded.sortn.bam
* arcs -s98 -c5 -l0 -z500 -m4-20000 -d0 -e30000 -r0.05 -v \
        -f DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.fa \
        -b DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.barcoded.c5_e30000_r0.05.arcs \
        -g DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.barcoded.c5_e30000_r0.05.arcs.dist.gv \
        --tsv=DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.barcoded.c5_e30000_r0.05.arcs.dist.tsv \
        --barcode-counts=DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.barcoded.sortn.bam.barcode-counts.tsv \
        DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.barcoded.sortn.bam
* /s/chopin/a/grad/asharifi/e/bin/tigmint-arcs-tsv DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.barcoded.c5_e30000_r0.05.arcs_original.gv DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.barcoded.c5_e30000_r0.05.arcs.links.tsv DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.fa
* cp DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.barcoded.c5_e30000_r0.05.arcs.links.tsv DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.barcoded.c5_e30000_r0.05.arcs.a0.1_l10.links.tigpair_checkpoint.tsv
* LINKS -k20 -l10 -t2 -a0.1 -x1 -s /dev/null -f DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.fa -b DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.barcoded.c5_e30000_r0.05.arcs.a0.1_l10.links
* gsed -r 's/^>scaffold([^,]*),(.*)/>\1 scaffold\1,\2/' DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.barcoded.c5_e30000_r0.05.arcs.a0.1_l10.links.scaffolds.fa >DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.barcoded.c5_e30000_r0.05.arcs.a0.1_l10.links.fa
* ln -sf DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.barcoded.c5_e30000_r0.05.arcs.a0.1_l10.links.fa DjScaff_fnl20141213.renamed.tigmint.arcs.fa

Is it required to re-run all of it?

lcoombe commented 6 years ago

@mossishahi - Since version 1.1.0, bugs were fixed in both tigmint-molecule and tigmint-make. I would recommend going back to the tigmint-molecule step. Take a look at the releases page for more details about the bugs fixed.

If you aren't already, I do highly recommend using the tigmint-make Makefile rather than running each command separately.

mossishahi commented 6 years ago

@lcoombe I installed the latest version of Tigmint. As you suggested, I ran the command tigmint-make arcs -n draft=DjScaff_fnl20141213.renamed reads=barcoded to see which commands are supposed to be run, planning to remove -n afterwards. But the output only includes: ln -sf DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.barcoded.c5_e30000_r0.05.arcs.a0.1_l10.links.fa DjScaff_fnl20141213.renamed.tigmint.arcs.fa. You mentioned that more commands should be run.

lcoombe commented 6 years ago

@mossishahi - I'm assuming that you are running that command in the same directory where you ran the commands previously - The makefile is detecting that certain 'target' files are already present, so it is not re-making them. You can move all the files generated after the initial bwa mem alignment to a temporary folder (so the makefile won't see them, but in case you don't want to delete them), and re-run the above command.

mossishahi commented 6 years ago

@lcoombe I'm running in the same directory. I see what you recommend, but if I run the Makefile instead of running the commands one by one, it will run bwa index and bwa mem again. Is it OK to keep the output of these commands and move the others to a temp directory?

mossishahi commented 6 years ago

Another question: how can I specify the number of threads in tigmint-make arcs?

lcoombe commented 6 years ago

@mossishahi - Yes, that is what I meant by moving all files generated after the initial bwa mem alignment -- keep the bwa index file and the DjScaff_fnl20141213.renamed.barcoded.sortbx.bam file in the same directory and move the others to a temp directory. Then, the Makefile should start up again after the first bwa mem alignment step. Sorry if that was unclear!
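One way to do that move is sketched below. It runs in a scratch directory with empty placeholder files so it is safe to try; the `old_run` folder name and the glob are assumptions based on the file names earlier in this thread, where every output after the first alignment starts with `<draft>.<reads>.as`.

```shell
draft=DjScaff_fnl20141213.renamed
reads=barcoded

# Scratch directory with empty placeholder files standing in for real outputs.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch "$draft.fa" "$draft.fa.bwt" "$draft.$reads.sortbx.bam" \
      "$draft.$reads.as0.65.nm5.molecule.bed" \
      "$draft.$reads.as0.65.nm5.molecule.size2000.bed" \
      "$draft.tigmint.arcs.fa"

# Move everything produced after the first bwa mem step out of make's sight.
# The draft, its bwa index, and the sortbx.bam file do not match and stay put.
mkdir -p old_run
for f in "$draft.$reads".as* "$draft.tigmint.arcs.fa"; do
    # -L also catches the final symlink if its target has already been moved.
    if [ -e "$f" ] || [ -L "$f" ]; then
        mv "$f" old_run/
    fi
done

ls
```

For a real run, skip the mktemp and touch lines and run the loop in the directory that holds the pipeline outputs.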

Take a look at the README.md file, which lists all of the parameters. Setting t specifies the number of threads (ex. t=8)
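As a rough sketch, the full invocation with a thread count could be assembled like this. The draft and reads values are the ones used in this thread; the `threads` and `cmd` variable names are only for illustration.

```shell
draft=DjScaff_fnl20141213.renamed
reads=barcoded
threads=8

# Parameters are passed to tigmint-make as name=value pairs.
cmd="tigmint-make arcs draft=$draft reads=$reads t=$threads"
printf '%s\n' "$cmd"

# If tigmint-make is on the PATH, do a dry run (-n) first to list the
# commands it would execute without running them.
if command -v tigmint-make >/dev/null 2>&1; then
    $cmd -n
fi
```

Dropping -n then runs the pipeline with the same settings.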

mossishahi commented 6 years ago

@lcoombe Thanks a lot. I had assumed there was no thread option for the tigmint-make arcs command. Another question: why do you recommend running the Makefile rather than running the commands one by one? Does it affect the results?

lcoombe commented 6 years ago

@mossishahi - No problem!

You will get the same results whether you run the commands separately or use the Makefile, but you make your life a lot easier by using the Makefile. Then, you can just launch one command rather than many individual ones - that's why we package the pipeline in this way. Also, it reduces the chance for small mistakes in the command line, or missing important changes that we make to the Makefile.

mossishahi commented 6 years ago

@lcoombe Unfortunately, while running the command below

arcs -s98 -c5 -l0 -z500 -m4-20000 -d0 -e30000 -r0.05 -v \
        -f DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.fa \
        -b DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.barcoded.c5_e30000_r0.05.arcs \
        -g DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.barcoded.c5_e30000_r0.05.arcs.dist.gv \
        --tsv=DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.barcoded.c5_e30000_r0.05.arcs.dist.tsv \
        --barcode-counts=DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.barcoded.sortn.bam.barcode-counts.tsv \
        DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.barcoded.sortn.bam

I got the error below. However, I had run the same command before updating Tigmint, and this step passed without any error.

Reading alignments: DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.barcoded.sortn.bam
error: mismatched sequence lengths: sequence 34565-6: 151 != 161
make: *** [/s/chopin/a/grad/asharifi/e/Applications/tigmint1.1.2/tigmint/bin/tigmint-make:221: DjScaff_fnl20141213.renamed.barcoded.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.barcoded.c5_e30000_r0.05.arcs_original.gv] Error 1

lcoombe commented 6 years ago

@mossishahi - Please ensure you are using the latest release of ARCS. This error looks like an issue that was resolved in ARCS v1.0.4.

mossishahi commented 6 years ago

@lcoombe thanks for your quick response. I will check that

lcoombe commented 6 years ago

@mossishahi - Looks like you were able to get Tigmint+ARCS running OK, and the gsed error was solved. I'm closing this issue - feel free to open another one if you have further problems!