Closed LisaBlazek closed 4 years ago
Hi @LisaBlazek,
I'm glad you got LINKS working OK after installing with anaconda!
-t is useful to play around with if you are finding that LINKS is using too much RAM. It influences the step size of the sliding window when extracting k-mer pairs. Generally, when I'm doing runs I will set -t as low as I can while ensuring that the run uses a reasonable amount of RAM for my machine. The lower you set it, the more k-mer pairs you can extract, which can be used for scaffolding evidence, but the RAM usage will be higher.
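To make the effect of the step size concrete, here is a minimal Python sketch of sliding-window k-mer pair extraction. It is not the LINKS implementation: the function name `extract_kmer_pairs` is hypothetical, and it assumes that each window of length d holds one k-mer at its start and one at its end, with the window moved t bases at a time.

```python
def extract_kmer_pairs(read, k, d, t):
    """Return (first_kmer, second_kmer) pairs from one read.

    Assumption: a window of length d spans both k-mers, and the
    window start advances by t bases per step (the role of -t).
    """
    pairs = []
    # Slide the window start from 0 to the last position where a
    # full window of length d still fits in the read.
    for start in range(0, len(read) - d + 1, t):
        first = read[start:start + k]
        second = read[start + d - k:start + d]
        pairs.append((first, second))
    return pairs


# Toy 10 kb read, for illustration only.
read = "ACGT" * 2500

small_step = extract_kmer_pairs(read, k=15, d=4000, t=2)
large_step = extract_kmer_pairs(read, k=15, d=4000, t=100)

# A smaller step yields many more pairs (more evidence, more RAM).
print(len(small_step), len(large_step))
```

Under these assumptions the same read gives roughly 50 times more pairs at t=2 than at t=100, which is why lowering -t raises memory use.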
Yes, we routinely run LINKS with multiple distances (-d), going from lower distances to higher, to make the best use of the shorter-range and longer-range information in the long reads. Generally, as you increase -d, you can decrease -t, since you start to extract fewer k-mer pairs as you increase the distance between them. But again, the -t setting at any given -d is mostly related to your RAM availability.
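The trade-off between -d and -t can be sketched with some simple counting. This is only an illustration under assumptions: the read lengths are made up, the helper names `count_pairs` and `total_pairs` are hypothetical, and the window of length d is assumed to span both k-mers and to move t bases per step.

```python
def count_pairs(read_length, d, t):
    """Number of k-mer pairs one read yields for a given d and t.

    Assumption: a window of length d is moved along the read in
    steps of t; reads shorter than d yield nothing.
    """
    if read_length < d:
        return 0
    return (read_length - d) // t + 1


# Hypothetical read lengths in bp, for illustration only.
read_lengths = [3000, 8000, 12000, 25000]


def total_pairs(d, t):
    """Total pairs over all reads for one (d, t) setting."""
    return sum(count_pairs(n, d, t) for n in read_lengths)


for d, t in [(5000, 20), (20000, 20), (20000, 5)]:
    print(f"d={d} t={t} pairs={total_pairs(d, t)}")
```

With these toy reads, going from d=5000 to d=20000 at the same step cuts the pair count sharply, because fewer reads are long enough and each has fewer window positions. Lowering t at the larger distance recovers some pairs without exceeding the count seen at the shorter distance.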
Hope that helps! Lauren
thanks @lcoombe
@LisaBlazek, if you have a smallish genome and RAM isn't too much of an issue, you can, as an alternative to iterative runs, do a single run with all k-mer pair distances combined.
You would set it like so: -d 20000,15000,10000,5000 -t 50,20
Note: if you omit values of -t as in the example above, the last value (e.g. 20) will also be used when extracting k-mer pairs at the shorter distances (e.g. 10000 and 5000).
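The padding rule described in the note can be written out as a small Python sketch. The helper name `pair_distances_with_steps` is hypothetical and this is not LINKS code; it only mirrors the stated behaviour that a short -t list is extended with its last value.

```python
def pair_distances_with_steps(d_arg, t_arg):
    """Match each -d value with a -t value.

    Mirrors the rule from the note: if fewer -t values than -d
    values are given, the last -t value is reused for the rest.
    """
    distances = [int(x) for x in d_arg.split(",")]
    steps = [int(x) for x in t_arg.split(",")]
    if len(steps) > len(distances):
        raise ValueError("more -t values than -d values")
    # Extend the step list with copies of its last value.
    steps += [steps[-1]] * (len(distances) - len(steps))
    return list(zip(distances, steps))


plan = pair_distances_with_steps("20000,15000,10000,5000", "50,20")
for d, t in plan:
    print(f"-d {d} uses -t {t}")
```

For the example above this pairs 20000 with 50 and every remaining distance with 20.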
Good luck and thank you for using LINKS!
thank you both for the fast reply :) now it makes more sense!
Hello! After installing LINKS with anaconda, running it is pretty easy and straightforward!
But now I have a question regarding the parameters -k, -d and -t. -k is the most straightforward one, but I have problems understanding how these three play together and how they influence the scaffolding process. I looked into the closed tickets and got some information, but I'm still not sure when to fine-tune what, and what the consequences will be. Would it be possible for you to describe it a little more, so that I get a better understanding? I'm totally aware that there is no strict recipe to follow. That's why I would like to better understand what the parameters are doing and what will happen if I change them in one direction or the other.
For one, I have problems understanding what -t does: does a lower -t mean less RAM and a higher -t more RAM? How does it then influence the scaffolding? The default is -t 2, but what does -t 4 or -t 100 do?
For -d, what I understand is that a smaller -d is advised for many small scaffolds and a bigger -d for fewer, larger scaffolds. So if you have an assembly with a combination, should multiple distances be used? And how does -t influence such assemblies?
It would be nice if you had time to answer this ticket! I really like working with this program and hope to improve my outcome further :)