MikkelSchubert / adapterremoval

AdapterRemoval v2 - rapid adapter trimming, identification, and read merging
http://adapterremoval.readthedocs.io/
GNU General Public License v3.0

How to construct the adapter list/specify adapters #33

Closed yonist closed 5 years ago

yonist commented 5 years ago

Hi,

I need to remove adapters from paired-end samples that were prepared with "TruSeq DNA PCR-Free HT" (the adapter sequences are listed on page 15 of https://dnatech.genomecenter.ucdavis.edu/wp-content/uploads/2013/06/illumina-adapter-sequences_1000000002694-00.pdf).

If I understand correctly, the first adapter (--adapter1 / first column in the file) is the one I expect to find in one of the files, and the second adapter (--adapter2 / second column in the file) is the one I expect to find in the other. However, I have no way to know which is which without consulting the provider?

Also, do I need to provide the adapters as is (as defined in the documentation), or do I need to reverse complement one of them? (I couldn't really tell what was fixed in issue #31.)

Thanks, Yoni

MikkelSchubert commented 5 years ago

Hi Yoni,

The basic rule for supplying adapter sequences to AdapterRemoval is that you should be able to grep the supplied --adapter1 and --adapter2 in your --file1 and --file2 files, respectively*. If I remember correctly, your --adapter2 sequence should be the reverse complement of the sequence listed in that document. You can either add the specific barcode you've used, or simply replace it with the corresponding number of Ns.
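As a rough illustration of the grep rule above, here is a minimal Python sketch (not part of AdapterRemoval) that reverse complements a sequence, masks a barcode with Ns, and counts how many reads in a FASTQ file contain a candidate adapter. The function names are hypothetical, the example sequences are toy placeholders rather than real Illumina adapters, and treating N as "match any base" is an assumption of this sketch only.

```python
import re

# Complement of each base; N (unknown base) stays N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def mask_barcode(adapter, start, length):
    """Replace `length` bases beginning at 0-based index `start` with Ns."""
    return adapter[:start] + "N" * length + adapter[start + length:]


def count_reads_with_adapter(fastq_lines, adapter):
    """Count reads whose sequence contains `adapter`.

    `fastq_lines` is any iterable of lines from a 4-line-per-record FASTQ
    file, e.g. an open file handle. Ns in the adapter match any base
    (an assumption of this sketch).
    """
    pattern = re.compile(adapter.upper().replace("N", "."))
    hits = 0
    for index, line in enumerate(fastq_lines):
        # The sequence is the second line of every 4-line record.
        if index % 4 == 1 and pattern.search(line.strip().upper()):
            hits += 1
    return hits


# Toy records with placeholder sequences, standing in for a --file1 FASTQ.
example_fastq = [
    "@read1", "TTACGGACGTAA", "+", "IIIIIIIIIIII",
    "@read2", "GGGGGGGG", "+", "IIIIIIII",
]
candidate = mask_barcode("ACGTACGT", 2, 3)  # "ACNNNCGT"
print(count_reads_with_adapter(example_fastq, candidate))
print(reverse_complement("AACG"))
```

If a candidate --adapter2 sequence gives essentially no hits in --file2 while its reverse complement gives many, that suggests the sequence was supplied in the wrong orientation.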

For PE data you can also use the --identify-adapters option to make AdapterRemoval attempt to reconstruct the original adapter sequences. I'd generally recommend using --identify-adapters when possible, since you'll occasionally run into unexpected changes to the adapter sequences in your data, which can negatively impact adapter trimming performance if not accounted for.

The bug that was fixed in #31 applies only when you use the --adapter-list option to supply a table containing one or more adapter pairs. The bug resulted in AdapterRemoval expecting the mate 2 adapters to be the reverse complement of the value you would supply to --adapter2, which was not intended. This bug/fix should not affect you, since you only have one pair of adapters.
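For reference, a table for --adapter-list could be produced along these lines. This is only a sketch: it assumes the table has one adapter pair per line, with the --adapter1 value in the first column and the --adapter2 value in the second, separated by whitespace (a tab here). The function name is hypothetical and the sequences are placeholders, not real adapters.

```python
def format_adapter_list(pairs):
    """Build the text of a two-column adapter table.

    Each pair is (adapter1, adapter2), written exactly as the values would
    be passed to --adapter1 and --adapter2 (no extra reverse complementing).
    """
    valid_bases = set("ACGTN")
    lines = []
    for adapter1, adapter2 in pairs:
        for seq in (adapter1, adapter2):
            # Reject empty sequences and anything that is not A/C/G/T/N.
            if not seq or set(seq.upper()) - valid_bases:
                raise ValueError(f"invalid adapter sequence: {seq!r}")
        lines.append(f"{adapter1}\t{adapter2}")
    return "\n".join(lines) + "\n"


# Placeholder pairs; substitute the sequences for your own library prep.
table_text = format_adapter_list([
    ("ACGTACGT", "TTGGNNNNCCAA"),
    ("GGCCTTAA", "AACCNNNNGGTT"),
])
print(table_text, end="")
```

The resulting text can be written to a file and passed via --adapter-list.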

Best, Mikkel Schubert

* This is of course not always possible in practice, since your data might not contain the full adapter sequence in any of the reads.

yonist commented 5 years ago

Thank you for your extensive answer.