turbomam/biosample-xmldb-sqldb
Tools for loading NCBI Biosample into an XML database and then transforming that into a SQL database
MIT License
0 stars
1 fork
issues
sparsity analysis of NCBI Biosamples via relational database created by this repo
#42
turbomam
closed
6 months ago
2
database competency questions
#41
turbomam
opened
7 months ago
1
check for capital letters in column names... may require double quotes in queries
#40
turbomam
opened
8 months ago
0
More FKs or views
#39
turbomam
opened
8 months ago
0
More FTS indices on tables other than NCBI Biosample data
#38
turbomam
opened
8 months ago
0
record current DDL
#37
turbomam
opened
8 months ago
0
NCBI MIxS class mappings
#36
turbomam
closed
8 months ago
0
what does "either_one_mandatory" mean as a NCBI Package Attribute use value?
#35
turbomam
opened
8 months ago
0
Add INSDC missing value table
#34
turbomam
closed
8 months ago
2
Add INSDC country code list
#33
turbomam
closed
8 months ago
2
MIxS combination to NCBI Package mapping
#32
turbomam
opened
8 months ago
0
NCBI Package/Attribute use
#31
turbomam
closed
8 months ago
1
add taxonomy data, minimally a list of metagenome ids
#30
turbomam
closed
8 months ago
5
Document process of restoring a SQL dump residing on a NERSC DTN, into a Spin-hosted Postgres container
#29
eecavanna
opened
8 months ago
0
Document process of retrieving a dump from `dtn01.nersc.gov` and restoring it into a Spin-hosted Postgres server
#28
eecavanna
opened
8 months ago
0
Include an alignment between GOLD Biosample IDs and NCBI Biosample IDs
#27
turbomam
opened
8 months ago
9
give pivot_harmonized_attributes.py a click CLI and `[tool.poetry.scripts]` entry
#26
turbomam
closed
8 months ago
0
try pre-building the destination pivot table with all harmonized name columns
#25
turbomam
closed
8 months ago
0
try AWS
#24
turbomam
closed
8 months ago
5
crashed during pivot_harmonized_attributes.py. Out of RAM (on host?)
#23
turbomam
closed
8 months ago
0
there is a mixture of empty string and NULL values in our Postgres tables
#22
turbomam
closed
8 months ago
0
"/srv/basex/shared-chunks/biosample_set_from_37000001.xml" (Line 20346097): XML document structures must start and end within the same entity.
#21
turbomam
closed
8 months ago
3
Incomplete XQuery results need to be trimmed before `copy`ing into Postgres
#20
turbomam
closed
8 months ago
0
Add fts (full text search) indices for some of the columns in the non_attribute_metadata Postgres table?
#19
turbomam
closed
8 months ago
0
Integrate more data, like a SQL representation of MIxS and the EnvO semantic SQL database
#18
turbomam
opened
9 months ago
6
What could we use as a delimiter between concatenated values besides `|||`
#17
turbomam
opened
9 months ago
0
Request for PRJNA656268 metadata from Adam Martiny via Montana
#16
turbomam
opened
9 months ago
2
Document reasonable Docker settings (CPU, RAM, storage, swap)
#15
turbomam
opened
9 months ago
0
Look for parallelization opportunities
#14
turbomam
opened
9 months ago
0
Make sure the containers are never CPU or RAM limited
#13
turbomam
opened
9 months ago
0
Use host mounts for the BaseX and Postgres data directories
#12
turbomam
opened
9 months ago
1
Use this as a starting point for instantiating NMDC Biosamples
#11
turbomam
opened
9 months ago
0
Add code, URLs, screenshots etc to "Simon’s half-baked way of looking for relevant metagenomes in SRA"
#10
turbomam
opened
9 months ago
6
Explain that NCBI's XML is chunked because there's a cap on the number of nodes in a BaseX database.
#9
turbomam
closed
8 months ago
1
Provide support for writing XQueries
#8
turbomam
closed
8 months ago
0
More documentation for accessing/using the BaseX web interface
#7
turbomam
closed
8 months ago
0
Add support for dumping the Postgres contents in a form that can be easily loaded into another Postgres server
#6
turbomam
opened
9 months ago
0
Find a better GH org for this repo
#5
turbomam
opened
9 months ago
0
Repair of MIxS triad values (esp. from EnvO)
#4
turbomam
opened
9 months ago
0
Repair of MIxS checklist/extension like values
#3
turbomam
opened
9 months ago
6
better capture and use of BioProject identifiers
#2
turbomam
opened
9 months ago
0
improve chaining of `pre-basex-all`, `basex-all` and `postgres-all` targets
#1
turbomam
closed
8 months ago
1