Open malcook opened 4 years ago
Thanks for the suggestion. Genrich analyzes the whole genome by default because that is how these assays work: ATAC-seq, ChIP-seq, etc. are performed on whole genomes, not just certain chromosomes or regions.
Nevertheless, I will consider the request. In the meantime, please use `-e` and `-E`, and let me know if there are any issues with them.
Thanks for the consideration. It is really a convenience that allows me to trial run an analysis on a fraction of the genome in the interest of debugging a larger workflow on a limited set of data. I am able to use -e effectively for this purpose to exclude all but one chromosome.
Thanks for Genrich!
~ malcolm_cook@stowers.org
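To illustrate the workaround described above, here is a minimal Python sketch that builds the `-e` argument (a comma-separated list of chromosomes to exclude) so that only one chromosome is kept. The chromosome names and file names are hypothetical, and the command is only assembled, not run.

```python
def chromosomes_to_exclude(all_chroms, keep):
    """Return a comma-separated list of every chromosome except those in `keep`."""
    keep = set(keep)
    missing = keep - set(all_chroms)
    if missing:
        raise ValueError(f"unknown chromosome(s): {sorted(missing)}")
    return ",".join(c for c in all_chroms if c not in keep)


# Hypothetical chromosome names; in practice they would come from the BAM header.
all_chroms = ["chr1", "chr2", "chr8", "chrM", "chrUn_KN150000v1"]
exclude = chromosomes_to_exclude(all_chroms, ["chr8"])

# Hypothetical file names; the command line is assembled here but not executed.
command = ["Genrich", "-t", "sample.bam", "-o", "sample.narrowPeak", "-e", exclude]
print(" ".join(command))
```

With many unplaced scaffolds the exclusion list gets long, which is part of why an include option would be more convenient.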
As a workaround, you can select the regions you want using `bedtools intersect`.
`bedtools intersect` is unlikely to produce the correct result in this context.
A parameter to provide genome length directly would also be very helpful. We subset data frequently to run multiple different peak callers with various parameters to find the best settings for a given assay.
There is now a `-L <int>` command-line argument that can be used to set the genome length directly.
During development of a pipeline involving Genrich for integrating ATAC-seq with ChIP-seq for multiple marks, I wish to call peaks on only a few small regions. For this reason, it is desirable to be able to specify which chromosomes or BED regions to include.
The effective genome should then be the regions to include minus the regions to exclude.
This would allow me to tell Genrich to analyze, e.g., chr8 only, minus any pre-computed global region blacklist.
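As a rough sketch of what "regions to include minus regions to exclude" would mean, the Python below subtracts excluded intervals from included ones on a single chromosome and sums the remainder. This is not Genrich code; the half-open coordinates and the toy 1,000 bp chromosome are assumptions for illustration only.

```python
def merge(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def subtract(include, exclude):
    """Return the parts of `include` not covered by `exclude` (one chromosome)."""
    result = []
    exclude = merge(exclude)
    for start, end in merge(include):
        cursor = start  # left edge of the part not yet emitted or excluded
        for ex_start, ex_end in exclude:
            if ex_end <= cursor:
                continue  # excluded region lies entirely to the left
            if ex_start >= end:
                break  # excluded region lies entirely to the right
            if ex_start > cursor:
                result.append((cursor, ex_start))
            cursor = max(cursor, ex_end)
            if cursor >= end:
                break
        if cursor < end:
            result.append((cursor, end))
    return result


# Hypothetical data: a toy 1,000 bp "chr8" minus three blacklisted regions.
include = [(0, 1000)]
exclude = [(100, 200), (150, 300), (900, 1100)]
effective = subtract(include, exclude)
effective_length = sum(end - start for start, end in effective)
print(effective, effective_length)
```

The resulting total is the kind of value one might supply as a genome length when working on a subset.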
Finally, being able to specify chromosomes to include or exclude using a regular expression would be great. One useful expression would be `-i '^chr\d+$'` to effectively remove (in the case of Ensembl zebrafish) chrM and any of the "unknown" chromosomal fragments matching "chrUn_*".
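For illustration, the effect of that expression can be sketched in Python. The `-i` option is only the proposal made here, not an existing Genrich flag, so the sketch does the filtering outside Genrich and turns the non-matching names into a comma-separated list that could be passed to `-e`. The chromosome names are hypothetical.

```python
import re

# The expression from the request: "chr" followed only by digits.
pattern = re.compile(r"^chr\d+$")

# Hypothetical chromosome names, e.g. as read from a BAM header.
names = ["chr1", "chr2", "chr25", "chrM", "chrUn_KN150000v1"]

included = [name for name in names if pattern.search(name)]
excluded = [name for name in names if not pattern.search(name)]

# Non-matching names joined in the comma-separated form that -e accepts.
exclude_arg = ",".join(excluded)
print(included)
print(exclude_arg)
```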
This feature would also simplify life for people seeking an easier way to #29.