Closed by tiagobrc 2 years ago
Hello Tiago,
Check out the --relabel and --label_suffix options. You could also change read labels using the Linux sed command.
Alternatively, adding the --sample flag to vsearch could be helpful.
Colin
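For the sed route mentioned above, here is a minimal sketch. It assumes GNU sed (the `1~4` address is a GNU extension) and strictly four-line FASTQ records; the file names, the read, and the sample name SampleA are made up for illustration.

```shell
# Work in a throwaway directory so no existing file is overwritten.
workdir="$(mktemp -d)"

# Hypothetical one-read FASTQ file standing in for a real sample.
printf '@read1\nACGT\n+\nIIII\n' > "${workdir}/SampleA.fastq"

# FASTQ headers sit on lines 1, 5, 9, ...
# '1~4' (GNU sed) selects every 4th line starting at line 1,
# and 's/$/.../' appends the sample annotation to the end of that line.
sed '1~4 s/$/;sample=SampleA;/' "${workdir}/SampleA.fastq" \
    > "${workdir}/SampleA.labelled.fastq"

# Show the relabelled record.
cat "${workdir}/SampleA.labelled.fastq"
```

Note that the suffix is appended at the very end of the header line, so if the header carries a description after a space, the annotation lands after that description rather than directly after the read identifier.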
Hi @tiagobrc,
With vsearch you can read from streams and write to streams, which allows for very flexible and powerful pipelines. For instance, you can merge many fastq samples while retaining the name of the original sample in the sequence headers:
# create individual samples (sample1.fastq ... sample9.fastq)
for SAMPLE_NAME in sample{1..9}.fastq ; do
    printf "@s\nA\n+\nI\n" > "${SAMPLE_NAME}"
done
# pool fastq samples, retain sample names in fastq headers
# (the glob must match the files created above)
for SAMPLE_NAME in sample*.fastq ; do
    vsearch \
        --quiet \
        --fastx_filter "${SAMPLE_NAME}" \
        --relabel "${SAMPLE_NAME/.fastq/}_" \
        --fastqout -
done > all_samples.fastq
# clean up
rm sample{1..9}.fastq all_samples.fastq
You will end up with fastq entries formatted as such: "@SampleName_EntryNumber"
I think the vsearch (and usearch) option --label_suffix ";sample=SampleA;" is equivalent to the usearch option -sample SampleA. This option can currently be used with the fastq_mergepairs and fastx_revcomp commands.
In principle the --label_suffix option could be used with almost any command that writes FASTA or FASTQ files, so I will enable it with many more commands in the next release.
I'll consider adding the --sample option and perhaps the use of the @ symbol with the --relabel option as well.
The --sample option has been added in commit 34df253533db53e5d2fe0f91395a750e9c7f5862. The new option and the --label_suffix option have been enabled for all commands that write FASTA or FASTQ files.
Added in version 2.21.0, which has just been released.
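With vsearch 2.21.0 or later, the two options discussed in this thread might be used as in the sketch below. It assumes vsearch is on the PATH and recent enough to accept --sample; the directory, file names, the single test read, and the sample name SampleA are hypothetical. The block skips the demo if vsearch is not installed.

```shell
# Throwaway directory and a hypothetical one-read FASTQ file.
demo_dir="$(mktemp -d)"
printf '@s\nA\n+\nI\n' > "${demo_dir}/SampleA.fastq"

if command -v vsearch > /dev/null 2>&1 ; then
    # Annotate headers with the sample name using --sample.
    vsearch \
        --quiet \
        --fastx_filter "${demo_dir}/SampleA.fastq" \
        --sample "SampleA" \
        --fastqout "${demo_dir}/with_sample.fastq"

    # Similar annotation, spelled out explicitly with --label_suffix.
    vsearch \
        --quiet \
        --fastx_filter "${demo_dir}/SampleA.fastq" \
        --label_suffix ";sample=SampleA;" \
        --fastqout "${demo_dir}/with_suffix.fastq"

    # Print the header line of each output for comparison.
    head -n 1 "${demo_dir}/with_sample.fastq" "${demo_dir}/with_suffix.fastq"
else
    echo "vsearch not found, skipping the demo" >&2
fi
```

Comparing the two header lines is a quick way to check how the annotation written by --sample differs from a hand-written suffix before using either in a larger pipeline.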
Is there a way in vsearch to add sample identifiers to read labels, just like in usearch (using the option -sample)?
I am trying to use vsearch, but this option does not seem to exist.
Can you please confirm that this is the case, or point me to the equivalent parameter in vsearch?
Usearch explanation of the function:
Cordially,
Tiago