Closed jeremymsimon closed 1 year ago
Now the names are kept in the output:
> getRegionGeneAssociations(job)
GRanges object with 7 ranges and 2 metadata columns:
seqnames ranges strand | annotated_genes dist_to_TSS
<Rle> <IRanges> <Rle> | <CharacterList> <IntegerList>
region1 chr1 144369564-144373476 * | NBPF9,PPIAL4B -440228,-7274
region3 chr3 146551327-146559750 * | PLSCR5,ZIC4 -231536,566532
region4 chr4 8691809-8698816 * | CPZ,HMX1 100926,178230
region5 chr5 149675764-149677003 * | CAMK2A,ARSI -7192,6132
region6 chr6 78404287-78413825 * | MEI4 8681
region7 chr7 153478599-153486676 * | DPP6 -267127
region8 chrX 86523878-86533664 * | KLHL4 -244046
-------
seqinfo: 7 sequences from an unspecified genome; no seqlengths
If the input is a GRanges object with names, the names are kept. If the input is a data frame, you need to specify use_name_column = TRUE in submitGreatJob().
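A minimal sketch of the data frame case, for illustration only. The coordinates and region names are copied from the output above. That the name is read from the fourth column (as in BED) is an assumption, so check ?submitGreatJob. The call needs the rGREAT package and network access to the GREAT server, so it is guarded here.

```r
# BED-like data frame; the fourth column holds the region names
# (assumption: rGREAT reads names from column 4, following BED convention)
bed <- data.frame(
  chr   = c("chr1", "chr3", "chr4"),
  start = c(144369564L, 146551327L, 8691809L),
  end   = c(144373476L, 146559750L, 8698816L),
  name  = c("region1", "region3", "region4"),
  stringsAsFactors = FALSE
)

# Only attempt the submission when rGREAT is installed
if (requireNamespace("rGREAT", quietly = TRUE)) {
  # use_name_column = TRUE is the argument named in the reply above
  job <- rGREAT::submitGreatJob(bed, use_name_column = TRUE)
  # The region names should now appear as names of the returned GRanges
  print(rGREAT::getRegionGeneAssociations(job))
}
```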
For the second question, the final number of regions is not always the same as the number you set. The reason is that I make the number of random regions on each chromosome proportional to the chromosome length, which means chromosome 1 gets more random regions than chromosome 19.
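To see why a length-proportional split can miss the requested total, here is a rough base R sketch. It is not the package's code: the helper allocate_regions(), the chromosome names and the lengths are all hypothetical, and the rounding rule is an assumption, since the thread does not say how the per-chromosome counts are rounded.

```r
# Hypothetical chromosome lengths in Mb (toy values, not a real genome build)
chr_len <- c(chrA = 250, chrB = 240, chrC = 200, chrD = 190, chrE = 60)

# Hypothetical helper: split nr regions across chromosomes in proportion
# to chromosome length, then round each share to a whole number of regions
allocate_regions <- function(nr, chr_len, rounding = floor) {
  share <- chr_len / sum(chr_len)   # fraction of the genome on each chromosome
  counts <- rounding(nr * share)    # whole regions per chromosome
  storage.mode(counts) <- "integer" # keep names, store as integers
  counts
}

nr <- 10
per_chr <- allocate_regions(nr, chr_len)
print(per_chr)       # longer chromosomes receive more regions
print(sum(per_chr))  # the total need not equal nr
```

With these toy lengths, rounding down gives a total of 8 and rounding to nearest (rounding = round) gives 11, so neither matches nr = 10. The mismatch comes from rounding each chromosome's share independently.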
Please update the package from GitHub.
Thanks @jokergoo! I'll give this a try. I will close this for now and reopen if I have any issues
Hi- This package is amazing to include in our workflows!
One question: on the web version, GREAT accepts a "name" field as part of the standard BED format: https://great-help.atlassian.net/wiki/spaces/GREAT/pages/655452/File+Formats#FileFormats-Whatshouldmytestregionsfilecontain%3F
If I include this in my BED-like data.frame or GenomicRanges object, it seems to get ignored and lost in the results table. Ideally I want getRegionGeneAssociations(job) to include chr, start, end, and name if it is present. This would be especially helpful if my input is in a particular order: since the results table is re-sorted, I can't simply apply names(output) <- names(input).
Is it possible to add this functionality?
Example here:
Now the same as BED format submitted to web tool:
As an additional side note, why am I only getting 8 regions when nr=10? Thanks!