[x] Currently using a "hashmap" object for gene ID mapping (inst/extdata/uniprotid_to_hgnc) and "hash" objects for InWeb, protein family, SNP-to-gene mapping, etc. Should we be consistent and use just one type of hash?
[x] What to store as RData under data/ vs. external files under inst/extdata?
SNP-to-gene mapping feature:
[x] The data stored in the old Genoppi GitHub ("snp_to_gene.RData") is a hash object with GENES as keys (i.e. gene -> SNP mapping, not vice versa). The accompanying code (in server.R) looks up each gene in the proteomic data to see whether its associated SNPs are included in the user-defined SNP list. *** I've implemented get_snp_list using this version for now. Adding the RData object seems to slow everything down noticeably (e.g. when running devtools::check() and test()).
[x] Would it be better to reimplement the hash object to do direct SNP -> gene mapping? This way get_snp_list and get_gwas_list would be more consistent with the other functions for processing different types of overlay data. *** Check whether storing this hash object would be computationally expensive (there are many more SNPs than genes).
[ ] Possible future to-do: it may be useful to make several versions of the mapping data using different reference panels (and add a parameter to get_snp_list to specify the reference panel used for mapping).
Re-run and document all data.
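
Sketch for the SNP -> gene question above: a minimal base-R example of inverting a gene -> SNP mapping and comparing the size of the two objects. The toy data and the name invert_mapping are hypothetical, and a plain named list stands in for the hash object, so this is only an illustration of the idea and not the code behind get_snp_list.

```r
# Toy gene -> SNP mapping (made-up values, NOT the contents of snp_to_gene.RData).
# A named list stands in for the hash object here.
gene_to_snp <- list(
  GENE_A = c("rs1", "rs2"),
  GENE_B = c("rs2", "rs3")
)

# Hypothetical helper: invert gene -> SNP into SNP -> gene.
# stack() flattens the list into a data frame (values = SNP, ind = gene);
# split() then regroups the gene names by SNP.
invert_mapping <- function(mapping) {
  flat <- stack(mapping)
  split(as.character(flat$ind), flat$values)
}

snp_to_gene <- invert_mapping(gene_to_snp)

# Direct lookup of the genes mapped to a user-supplied SNP.
snp_to_gene[["rs2"]]

# Rough memory comparison of the two orientations.
print(object.size(gene_to_snp), units = "Kb")
print(object.size(snp_to_gene), units = "Kb")
```

Running the same object.size() comparison on the real mapping would give a first estimate of whether the SNP-keyed version is too large to ship under data/.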