We store the augmented gene-level (and soon transcript-level) feature information in the same directory as the data itself (getOption("archs4.datadir")).
Currently, the gene-level metadata is simply copied from the GenomicsStudyDb package. Those data were generated from the GENCODE-basic annotations, however, and we probably want to recreate them from the full Ensembl transcript files.
We should be able to create an archs4-specific feature table in three steps: first parse the Ensembl transcript identifiers from the transcript-level HDF5 files, then roll them up to Ensembl gene IDs with their associated gene symbols, and finally map those Ensembl-derived gene symbols to the organism_matrix.h5 gene-level count files.
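The roll-up described above could look roughly like the following base R sketch. It is only an illustration under stated assumptions: the identifiers are made up, the tx2gene lookup table is hypothetical (in practice it would come from an Ensembl annotation), and reading the identifiers out of the HDF5 files is left out entirely, so plain character vectors stand in for the file contents.

```r
# Toy stand-ins for data that would really be read from the HDF5 files.
# All identifiers and symbols below are invented for illustration.
tx_ids <- c("ENST0001.1", "ENST0002.3", "ENST0003.1", "ENST0004.2")

# Hypothetical transcript -> gene lookup table (assumed to be derived
# from an Ensembl annotation; not part of the archs4 package).
tx2gene <- data.frame(
  transcript_id = c("ENST0001", "ENST0002", "ENST0003", "ENST0004"),
  gene_id       = c("ENSG0001", "ENSG0001", "ENSG0002", "ENSG0003"),
  symbol        = c("GENEA", "GENEA", "GENEB", "GENEC"),
  stringsAsFactors = FALSE
)

# Gene symbols as they might appear as row identifiers in the
# gene-level count file (again, toy values).
matrix_symbols <- c("GENEB", "GENEA", "GENED")

# 1. Parse the transcript identifiers: drop any trailing version suffix.
tx_clean <- sub("\\.[0-9]+$", "", tx_ids)

# 2. Annotate each transcript with its gene ID and symbol.
tx_info <- tx2gene[match(tx_clean, tx2gene$transcript_id), ]

# 3. Roll up to one row per gene, keeping a transcript count per gene.
gene_info <- unique(tx_info[, c("gene_id", "symbol")])
gene_info$n_transcripts <-
  as.integer(table(tx_info$gene_id)[gene_info$gene_id])

# 4. Map the Ensembl-derived symbols onto the gene-level matrix rows.
#    NA marks a gene whose symbol is absent from the count file.
gene_info$matrix_row <- match(gene_info$symbol, matrix_symbols)
```

Matching on symbols is lossy when a symbol is missing from, or ambiguous in, the count file, so the NA entries in matrix_row are worth inspecting before the table is saved.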