Closed hidvegin closed 4 years ago
Hi @hidvegin,
Make soft links to your contigs and reads in your current working directory, and then specify the read files without any explicit path (instead of using full paths as you have now) -- that's what Tigmint expects and usually solves this sort of error.
If your various reads are in different files, the easiest thing to do would be to concatenate them into a single, interleaved, gzipped fastq file. If you wanted to, you could alter the barcodes to be specific per library (e.g. Library 1 10x barcodes BX:Z:<barcode>-1, Library 2 BX:Z:<barcode>-2, etc., also mentioned in #33).
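As a rough illustration of the per-library barcode idea, here is a minimal Python sketch. It assumes each library is already a gzipped, interleaved fastq file with the barcode stored as a BX:Z: tag in the read header; the function names and the regular expression are hypothetical and not part of Tigmint.

```python
import gzip
import re

# Matches a BX:Z: barcode tag and an optional numeric suffix such as "-1".
BX_TAG = re.compile(r"(BX:Z:[ACGTN]+)(-\d+)?")

def retag_header(header, library_index):
    """Replace (or add) the numeric suffix on the BX:Z: tag with the library index."""
    return BX_TAG.sub(lambda m: f"{m.group(1)}-{library_index}", header, count=1)

def merge_libraries(fastq_paths, out_path):
    """Concatenate gzipped fastq files, tagging barcodes with 1, 2, ... per library."""
    with gzip.open(out_path, "wt") as out:
        for library_index, path in enumerate(fastq_paths, start=1):
            with gzip.open(path, "rt") as reads:
                for line_number, line in enumerate(reads):
                    # Every fourth line, starting at 0, is a fastq header line.
                    if line_number % 4 == 0:
                        line = retag_header(line, library_index)
                    out.write(line)
```

Headers without a BX:Z: tag pass through unchanged, so reads without a barcode are kept as they are.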
Thank you for your interest in Tigmint! Lauren
Dear @lcoombe,
Thank you for your answer. What is the optimal CPU usage for tigmint? Should I use more than 8 CPUs? So far, I have run tigmint-make with -t8.
Hi @hidvegin - Generally, using more CPU will be better, especially for the alignment stage, so it really just depends on the limitations of your machine.
Can Tigmint continue a stopped job?
If by a stopped job, you mean that it can resume a partial run part-way through, yes. It is based on a Makefile, which is a set of rules that will be executed. If it detects that a file has already been made, it will not re-make that file.
If you want to see where it will start again, use the dry-run option -n in the tigmint-make command, which will print out the commands that will be run, without executing them.
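To make the resume behaviour concrete, here is a small Python sketch of the check that make performs for each rule, in simplified form: a target is rebuilt if it is missing or older than any of its inputs. This is only a model of general make behaviour, not code from Tigmint, and the function name is hypothetical.

```python
import os

def needs_rebuild(target, prerequisites):
    """Simplified model of make: rebuild if the target is missing or out of date."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    # Any input modified after the target was written makes the target stale.
    return any(os.path.getmtime(path) > target_mtime for path in prerequisites)
```

Files that were completed before the job stopped pass this check and are skipped, which is why a restarted run picks up at the first missing output.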
I tried to resume tigmint-make with 80 CPUs, but the -t parameter did not work. I also tried tigmint-make --jobs=80, but that does not seem right either, because bwa mem still uses only 8 CPUs with -t8. How should I pass the parameter to tigmint-make so that it uses all 80 CPUs?
Since tigmint-make is a Makefile, you specify parameters like this: t=80.
See this part of the README for more examples: https://github.com/bcgsc/tigmint#parameters-of-tigmint
Thank you. It helped a lot. Now, it is working correctly.
Hi @lcoombe,
tigmint-make finished the scaffolding with both the contigs and the unitigs generated with canu. In the draft.tigmint.arcs.fa file I found several scaffolds of 1, 2 or 3 bp in length, which were not present in the contigs or unitigs file. The contigs and unitigs only contain sequences of 1000 bp or larger. How should I set the parameters in tigmint-make to prevent these short sequences?
Hi @hidvegin,
Those small sequences are a product of how tigmint decides on the location of cuts. In short, at a putative misassembly (i.e. Tigmint doesn't find any spanning molecules along the sliding window), it is possible that 1 or 2 cuts will be made. If there are two cuts, they are usually quite close to each other and can lead to the small sequence(s). This roughly depends on whether it is a blunt misassembly or one mediated by a repeat sequence. You could take a look at the methods in the Tigmint paper if you want more detail (there is some pseudocode there that describes how the cut points are decided).
If you don't want any of the small sequences in there, I'd suggest just doing a post-Tigmint step to filter them out (e.g. using seqtk seq or a one-liner).
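For intuition about the "no spanning molecules along the sliding window" condition, here is a deliberately simplified Python sketch. It is not Tigmint's implementation (the paper's pseudocode is the reference); the function name, the coordinate convention (half-open start/end pairs) and the default parameter values are assumptions for illustration only.

```python
def unspanned_windows(molecules, contig_length, window=1000, min_spanning=1):
    """Return start positions of windows spanned by fewer than min_spanning molecules.

    molecules is a list of (start, end) coordinates on one contig; a molecule
    spans a window when it covers the window from its first to its last base.
    """
    starts = []
    for window_start in range(0, contig_length - window + 1):
        window_end = window_start + window
        spanning = sum(
            1 for start, end in molecules
            if start <= window_start and end >= window_end
        )
        if spanning < min_spanning:
            starts.append(window_start)
    return starts
```

With two molecules that leave a gap between them, the reported window starts cluster around that gap, which is the kind of region where a cut would be considered.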
Hi @lcoombe,
How could I filter them out with seqtk seq? How should I decide which sequences to filter out?
Hi @hidvegin,
You could decide on a length threshold that you want for your assembly (call it 'x'), and use this command:
seqtk seq -L x my_fasta.tigmint.fa > my_fasta.tigmint.Lx.fa
They are all valid sequences, so it would be up to you to decide on a length filter appropriate for your particular assembly project.
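If seqtk is not available, the same kind of length filter can be sketched in plain Python. This is a minimal example, assuming a plain-text FASTA input; the function names are hypothetical, and sequences are written on a single line rather than wrapped.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)

def filter_by_length(lines, min_length):
    """Yield FASTA records whose sequence is at least min_length bases long."""
    for header, sequence in read_fasta(lines):
        if len(sequence) >= min_length:
            yield f">{header}\n{sequence}\n"
```

A typical use would be to open the Tigmint output, pass the file handle to filter_by_length with the chosen threshold, and write each returned record to a new file.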
Thanks @lcoombe for your answer. I have 150x Illumina PE reads from the same plant genome as these scaffold sequences. What is the most suitable tool for correcting the draft genome that I generated with tigmint-make arcs? Maybe Sealer or ntEdit?
@hidvegin - No worries!
Yes, if it is more polishing and assembly finishing that you are looking for, Sealer and ntEdit are good options. Sealer will fill gaps in your existing assembly, and ntEdit performs assembly polishing.
Thank you @lcoombe. Should I use the same linked reads also with the Illumina PE reads in Sealer and ntEdit? I have got Illumina PE reads from mRNA-seq. Should I use it to improve the draft genome in Sealer or ntEdit?
Hi @hidvegin - Yes, I'd suggest using the same linked reads with Sealer and ntEdit. I'd be more hesitant to use the RNA-seq reads with these tools, since they would be limited to improving the genic space only (vs the genomic reads, which could improve genic + all other regions of the genome assembly).
This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your interest in Tigmint!
I have a draft genome assembled from 30x PacBio data for a plant genome of about 4 Gbp. I would like to use 60x 10x Genomics linked reads for correction. I have 4 libraries from 10x Genomics. I tried tigmint with a dry run. These were the parameters I used:
tigmint-make arcs -n draft=$HOME/szeged/fk8jybr/input/pacbio_assembled_canu/lculinaris.contigs reads=$HOME/szeged/fk8jybr/input/Illumina_10x/LC001 $HOME/szeged/fk8jybr/input/Illumina_10x/LC002 $HOME/szeged/fk8jybr/input/Illumina_10x/LC003 $HOME/szeged/fk8jybr/input/Illumina_10x/LC004
I got this output message:
How can I set the parameters to use all 4 linked-read libraries?