Closed rdmorin closed 10 months ago
Thanks for reporting Ryan.
I just had a quick look, and it seems that there are no genes residing inside the specified region (in the example you provided). Could this be the reason that the function "often" fails for you? Regardless, it would be useful to have the function notify the user if there are no genes found within the specified region.
For example, if I expand the region in your example by 2000bp we get this back:
> region_to_gene(region="chr14:105861865-105864058",genome_build = "hg38")
3 gene(s) returned for 1 region(s)
chromosome start end hugo_symbol region_start region_end
1 chr14 105863197 105863258 IGHJ6 105861865 105864058
2 chr14 105863415 105863465 IGHJ3P 105861865 105864058
3 chr14 105863813 105863862 IGHJ5 105861865 105864058
As a note, I have ran in to similar issues when I (mistakenly) provide a region to this function that is in respect to another projection, than what is specified within the call of the region_to_gene
function.
Yes, that's likely the cause. Do you think it should throw an error (as it's currently doing) or elegantly return an empty data frame instead? The issue I'm having is that calling this function within another function currently requires the call to be wrapped in a tryCatch().
If we have it return an empty data frame (but also a message so the user would be aware) I would assume that this wouldn't break in a nested environment, like you're describing. So I would vote for this.
The messages should only be used in verbose mode though, right? Currently it's already printing out a message for each region the function is given, which leads to a very verbose output if a function calls it repeatedly.
For example, here's the current output (with just the region being printed in the calling function for debugging purposes):
[1] "chr1:92914901-92915003"
1 gene(s) returned for 1 region(s)
[1] "chr5:10400752-10401894"
1 gene(s) returned for 1 region(s)
[1] "chr14:105773847-105859779"
7 gene(s) returned for 1 region(s)
[1] "chr20:17966979-17967114"
2 gene(s) returned for 1 region(s)
[1] "chr22:22322826-22355878"
7 gene(s) returned for 1 region(s)
[1] "chr6:29810026-29904500"
8 gene(s) returned for 1 region(s)
[1] "chr7:70496350-70496767"
1 gene(s) returned for 1 region(s)
[1] "chrX:1087815-1088665"
Here's the original error message:
missing value where TRUE/FALSE needed[1] "chr14:64458111-64461841"
2 gene(s) returned for 1 region(s)
[1] "chr5:79031008-79032539"
1 gene(s) returned for 1 region(s)
[1] "chr6:32654870-32754078"
7 gene(s) returned for 1 region(s)
[1] "chr17:13733571-13734989"
Here's the original error message:
missing value where TRUE/FALSE needed[1] "chr13:88396330-88398128"
Here's the original error message:
missing value where TRUE/FALSE needed[1] "chr15:40050289-40051291"
1 gene(s) returned for 1 region(s)
[1] "chr18:15656295-15702692"
Here's the original error message:
missing value where TRUE/FALSE needed[1] "chr22:31226197-31227014"
1 gene(s) returned for 1 region(s)
[1] "chr1:182114358-182114666"
1 gene(s) returned for 1 region(s)
[1] "chr14:105773233-105858897"
6 gene(s) returned for 1 region(s)
[1] "chr2:88860382-88860480"
1 gene(s) returned for 1 region(s)
[1] "chr5:50456310-50468640"
Here's the original error message:
missing value where TRUE/FALSE needed[1] "chr5:105836364-105839254"
Here's the original error message:
missing value where TRUE/FALSE needed[1] "chr14:105773387-105859809"
7 gene(s) returned for 1 region(s)
[1] "chr17:49796951-49803952"
1 gene(s) returned for 1 region(s)
[1] "chr1:124945634-124955006"
Here's the original error message:
missing value where TRUE/FALSE needed[1] "chr14:105859052-105859507"
1 gene(s) returned for 1 region(s)
[1] "chr12:11380951-11420570"
Agree! I will provide an update for this function. Thanks for reporting.
When I run region_to_gene I often get an error instead of a result. For example: