databio / GenomicDistributions

Calculate and plot distributions of genomic ranges
http://code.databio.org/GenomicDistributions

Nearest Genes Function #170

Closed nleroy917 closed 2 years ago

nleroy917 commented 3 years ago

Given a query region set and a set of annotations, this function finds the nearest annotation to each query region, along with the gene type of that annotation and the distance to it.

This is an update of the previous version, which wasn't functioning properly!

The only issue I am having is that the function requires the annotation file to define the gene name as gene_id and the gene type as gene_biotype. It would be nice to introduce keyword parameters that default to these two names but can be changed, so that the information can still be extracted when an annotation file uses a different schema or naming convention.

However, I am having a hard time attempting to dynamically access these attributes. For example:

query$gene_type = annotation[nearestIds]$gene_type

versus

query$gene_type = annotations[nearestIds][key_name]
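
For reference, one way to read a metadata column whose name is held in a variable is the `[[` operator on `mcols()`. The following is a minimal sketch, assuming the annotation is a GRanges object. The function name `nearestAnnotation`, the parameters `geneIdKey` and `geneTypeKey`, and the toy data are hypothetical and are not taken from the package.

```r
library(GenomicRanges)

# Hypothetical helper: the metadata column names are parameters, defaulting
# to the names the function currently requires.
nearestAnnotation = function(query, annotation,
                             geneIdKey = "gene_id",
                             geneTypeKey = "gene_biotype") {
    meta = mcols(annotation)
    # Fail early if the requested columns are not present
    stopifnot(all(c(geneIdKey, geneTypeKey) %in% colnames(meta)))
    # Index of the nearest annotation for each query region (NA if none)
    nearestIds = nearest(query, annotation)
    # `[[` accepts a column name stored in a variable, unlike `$`
    query$gene_id = meta[[geneIdKey]][nearestIds]
    query$gene_type = meta[[geneTypeKey]][nearestIds]
    # Distance to the nearest annotation, NA where no neighbor exists
    hits = distanceToNearest(query, annotation)
    dists = rep(NA_integer_, length(query))
    dists[queryHits(hits)] = mcols(hits)$distance
    query$distance = dists
    query
}

# Toy data with non-default column names (assumed for illustration)
query = GRanges("chr1", IRanges(c(100, 900), width = 10))
annotation = GRanges("chr1", IRanges(c(150, 1000), width = 50),
                     symbol = c("geneA", "geneB"),
                     biotype = c("protein_coding", "lncRNA"))
result = nearestAnnotation(query, annotation,
                           geneIdKey = "symbol", geneTypeKey = "biotype")
```

The `$` operator takes its argument literally, which is why `annotation$key_name` looks for a column actually called `key_name`. `[[` evaluates its argument first, so it works with a variable.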

kkupkova commented 2 years ago

The function gives an error. There is definitely a typo: check "annotation" vs. "annotations". Please test that the function works before submitting it for review. Also, I don't think it would work with GRangesList objects at this point (check how all the other functions include code for GRangesList objects).
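
To illustrate what GRangesList support could look like, here is a minimal sketch in which the function calls itself on each element of the list and returns one result per element. The function name `calcNearestIdx` is hypothetical, and the sketch is not copied from the package's code.

```r
library(GenomicRanges)

# Hypothetical function: returns the index of the nearest annotation
# for each query region.
calcNearestIdx = function(query, annotation) {
    # For a GRangesList, recurse over the elements and return a list
    if (is(query, "GRangesList")) {
        return(lapply(query, calcNearestIdx, annotation))
    }
    # Single GRanges: integer vector of nearest annotation indices
    nearest(query, annotation)
}

# Toy input (assumed for illustration): two named region sets
queryList = GRangesList(a = GRanges("chr1", IRanges(100, 109)),
                        b = GRanges("chr1", IRanges(900, 909)))
annotation = GRanges("chr1", IRanges(c(150, 1000), width = 50))
res = calcNearestIdx(queryList, annotation)
```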

nsheff commented 2 years ago

Did you confirm that this works with GRangesList objects?

nsheff commented 2 years ago

I see now this was closed in favor of https://github.com/databio/GenomicDistributions/pull/172

It's good to put a note here for future reference.