nioo-knaw / epiGBS2

This is the epiGBS2 snakemake pipeline as published in a preprint version.
MIT License

Process_radtags parameters #12

Open MadokaSuga opened 2 years ago

MadokaSuga commented 2 years ago

Hello Maarten,

I have some questions about the parameters you use for demultiplexing.

If I understand correctly, you do not allow mismatches in the barcode (--barcode_dist 0) and you do not disable the rad-check. Is that important for the pipeline? Given that bisulfite conversion will change the restriction site (depending on which enzyme you use), should we not turn off the rad-check in order to recover more reads?

Example: with the restriction enzyme MspI (C^CGG) we observe the unmodified CGG, but also TGG and CGA (all explainable, I think, by bisulfite conversion and subsequent PCR). Would these reads not be lost unless the rad-check is disabled?
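To make the MspI example concrete, here is a minimal Python sketch that enumerates the read-start sequences a bisulfite-treated cut site could produce. It assumes a deliberately simplified model: any C may or may not be converted to T on the sequenced strand, and any G may appear as A when the read derives from the converted complementary strand after PCR. The function names are hypothetical and are not part of epiGBS2 or Stacks.

```python
from itertools import product


def conversion_variants(site, source, target):
    """Return every sequence obtained by replacing any subset of
    `source` bases in `site` with `target`."""
    options = [(base, target) if base == source else (base,) for base in site]
    return {"".join(combo) for combo in product(*options)}


def bisulfite_variants(site):
    """Simplified model (an assumption, not the pipeline's logic):
    C->T for the sequenced strand, G->A for reads copied from the
    converted complementary strand."""
    return conversion_variants(site, "C", "T") | conversion_variants(site, "G", "A")


# Read start left by MspI (C^CGG) is CGG.
print(sorted(bisulfite_variants("CGG")))
```

Under this model both TGG and CGA appear among the variants of CGG, which is consistent with the sequences observed above.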

Thanks for the info,

Madoka

MaartenPostuma commented 2 years ago

Hi Madoka,

I have not observed large amounts of read loss in my experiments using AseI (AT^TA) and NsiI (T^ACG). The demultiplexing log file, output/output_demutliplex/process_radtags.log, shows how many reads are lost. In my experiments this did not exceed 2% of the demultiplexed reads, although my restriction sites might be more robust to bisulphite conversion. Moreover, we use the -r (rescue) flag, which corrects barcodes and the RAD cut site when they differ by no more than one nucleotide from the expected sequence.

Furthermore, it may be worth looking further into this log file, as it also includes a section on "denovo" barcodes: barcode combinations that the program found but that were not in the barcode file. This is also a good place to start looking for possible reasons for low read retention during demultiplexing.

Greetings,
Maarten
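As a rough illustration of what a one-nucleotide rescue implies for a bisulfite-converted cut site, the Python sketch below checks which observed read starts lie within one mismatch of the expected sequence. This is only a simplified model of the idea and not the actual process_radtags implementation; the function names and the example sequences for MspI (expected read start CGG) are assumptions chosen for illustration.

```python
def hamming(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def rescuable(observed, expected, max_mismatches=1):
    """True if `observed` is within `max_mismatches` of `expected`
    (hypothetical stand-in for a one-mismatch rescue)."""
    return hamming(observed, expected) <= max_mismatches


expected = "CGG"  # read start left by MspI (C^CGG), used as an example
for observed in ["CGG", "TGG", "CGA", "CAA"]:
    print(observed, rescuable(observed, expected))
```

In this model, singly converted sites such as TGG or CGA would still pass, while a site carrying two conversions (CAA) would not, which is one way the rad-check could still discard some reads.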