Open mhSepehri opened 1 month ago
Is there any way to use reference genomes of organisms other than hg19, hg38, mm9, and mm10?
Hi, I'm afraid this is not possible at the moment. Implementing it has been on my to-do list for a while, and it shouldn't be too difficult. facets (the R package) supports it, though.
Note to myself - this is how it could be implemented:
The `--gbuild`/`-g` option should also accept a path to a BED file whose 4th column is the percentage GC in 1 kb windows (users can prepare this file with bedtools). Make a list (`gcData`) from this file whose items are vectors of %GC (the 4th column), one per chromosome. This is the format used in the pctGCdata package.
If `gbuild` is a BED file, the function `make_header` can get the chromosome sizes from `gcData`.
In the `preProcSample` function call, if `gbuild` is a BED file, set `gbuild="udef", ugcpct=gcData`. Use `ugcpct=NULL` otherwise.
The function `reset_chroms` can probably do nothing if `gbuild` is a BED file.
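The steps above could be sketched in base R roughly as follows. This is only an illustration under stated assumptions, not the actual implementation: `read_gc_bed` and `chrom_sizes_from_gc` are hypothetical helper names, the BED file is assumed to be tab-separated with exactly four columns and no header, and chromosome sizes are approximated as the number of windows times 1000 (an upper bound, since the last window of a chromosome may be shorter). The exact window layout that facets expects for `ugcpct` should be checked against the facets documentation.

```r
# Hypothetical helper: read a 4-column BED (chrom, start, end, %GC) into a
# named list holding one numeric vector of %GC values per chromosome.
read_gc_bed <- function(path) {
  bed <- read.table(path, sep = "\t", header = FALSE, quote = "",
                    comment.char = "", stringsAsFactors = FALSE,
                    col.names = c("chrom", "start", "end", "gc"),
                    colClasses = c("character", "numeric", "numeric", "numeric"))
  # Keep chromosomes in the order in which they first appear in the file
  bed$chrom <- factor(bed$chrom, levels = unique(bed$chrom))
  # Sort windows by start position within each chromosome
  bed <- bed[order(bed$chrom, bed$start), ]
  split(bed$gc, bed$chrom)
}

# Hypothetical helper: approximate chromosome sizes from the window counts,
# assuming non-overlapping windows of `window` bases.
chrom_sizes_from_gc <- function(gcData, window = 1000) {
  vapply(gcData, length, integer(1)) * window
}

# Toy BED file for demonstration only (values are made up)
bed_path <- tempfile(fileext = ".bed")
writeLines(c(
  "chr1\t0\t1000\t41.5",
  "chr1\t1000\t2000\t38.0",
  "chr2\t0\t1000\t45.2"
), bed_path)

gcData <- read_gc_bed(bed_path)
sizes <- chrom_sizes_from_gc(gcData)

# Arguments for preProcSample as described in the note:
# "udef" plus the GC list for a BED file, ugcpct = NULL for a preset build.
is_bed <- file.exists(bed_path)
gbuild_arg <- if (is_bed) "udef" else "hg19"
ugcpct_arg <- if (is_bed) gcData else NULL

# The call itself is left commented out because it needs facets and a
# read-count matrix (`rcmat` is a placeholder name):
# facets::preProcSample(rcmat, gbuild = gbuild_arg, ugcpct = ugcpct_arg)
```

Sorting by start position within each chromosome is a defensive choice so that the vectors do not depend on the row order of the BED file.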
Alternatively:
If `--gbuild` is a preset string, rename chromosomes as appropriate and assign the pctGCdata list to the object `gcData`. If `gbuild` is a BED file, read it, make it a list of chromosomes, and assign it to `gcData`.
Proceed as if `gbuild` is always a custom genome and use `gcData` instead.
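A minimal sketch of this alternative, again with assumptions spelled out: `resolve_gc_data` is a hypothetical function name, and `presets` is a hypothetical named list standing in for the preset GC data (in practice this would come from the pctGCdata package). The chromosome renaming step for presets is omitted here, and the BED file is assumed to be tab-separated with exactly four columns and no header.

```r
# Hypothetical resolver: return a per-chromosome list of %GC vectors,
# whether --gbuild names a preset build or points to a BED file.
resolve_gc_data <- function(gbuild, presets) {
  if (gbuild %in% names(presets)) {
    # Preset build: use the stored GC list (chromosome renaming omitted)
    return(presets[[gbuild]])
  }
  if (!file.exists(gbuild)) {
    stop("--gbuild must be a preset name or a path to a BED file: ", gbuild)
  }
  bed <- read.table(gbuild, sep = "\t", header = FALSE, quote = "",
                    comment.char = "", stringsAsFactors = FALSE,
                    col.names = c("chrom", "start", "end", "gc"),
                    colClasses = c("character", "numeric", "numeric", "numeric"))
  # One vector per chromosome, in order of first appearance in the file
  split(bed$gc, factor(bed$chrom, levels = unique(bed$chrom)))
}

# Toy inputs for demonstration only (values are made up)
presets <- list(hg19 = list(`1` = c(41.2, 39.8), `2` = c(40.1)))
bed_path <- tempfile(fileext = ".bed")
writeLines(c(
  "chr1\t0\t1000\t41.5",
  "chr1\t1000\t2000\t38.0",
  "chr2\t0\t1000\t45.2"
), bed_path)

gc_preset <- resolve_gc_data("hg19", presets)
gc_custom <- resolve_gc_data(bed_path, presets)

# Downstream code would then treat both cases as a custom genome, e.g.
# (`rcmat` is a placeholder for the read-count matrix):
# facets::preProcSample(rcmat, gbuild = "udef", ugcpct = gc_custom)
```

The appeal of this design is that everything after the resolver has a single code path, at the cost of having to rename preset chromosomes up front.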
Is there any way to use reference genomes of organisms other than hg19, hg38, mm9, and mm10?
I am trying to use facets for dog samples (canFam3). The first step, generating the pileup file, works fine and creates an output file (`output.csv.gz`) containing all 38 autosomes plus X, but the final results are not produced correctly.