galaxyproject / tpv-shared-database

A shared database of rules for Total Perspective Vortex used by the usegalaxy.* federation.
MIT License

Centralized database for memory usage data #75

Open natefoo opened 1 week ago

natefoo commented 1 week ago

Right now the memory and cores values in the database are a bit arbitrary. Luckily, we have all the data needed to reason about better mem values. Cores should probably typically be mem/4, as discussed in #60, except in cases where tools use little memory but see a significant speedup with more cores.
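As a rough illustration of the mem/4 rule of thumb, here is a minimal sketch. It assumes mem is expressed in GB, that fractional results are rounded up, and that a job always gets at least one core; the function name `cores_from_mem` and its parameters are hypothetical and not part of TPV or the shared database.

```python
import math


def cores_from_mem(mem_gb: float, gb_per_core: float = 4.0, min_cores: int = 1) -> int:
    """Derive a core count from a memory value using the mem/4 heuristic.

    Assumptions (not from the shared database): mem is in GB, results are
    rounded up, and every job gets at least `min_cores` cores.
    """
    return max(min_cores, math.ceil(mem_gb / gb_per_core))


if __name__ == "__main__":
    for mem in (2, 10, 16, 64):
        print(f"mem={mem} GB -> cores={cores_from_mem(mem)}")
```

Tools that use little memory but scale well with more cores would still need a manual override, as noted above.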

If we all pushed our memory usage and input sizes to a centralized database, we could both visualize the data (similar to how I have done it one-off in this gist) and hopefully make some decisions about memory values in the shared DB automatically.
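To make the idea of automatic decisions concrete, here is one possible sketch, not a proposal for the actual method. It assumes each pooled record is a `(tool_id, input_bytes, peak_mem_bytes)` tuple and recommends a high percentile of observed peak memory plus some headroom. The record layout, the function name `recommend_mem_gb`, the 95th percentile, and the 1.25 headroom factor are all assumptions chosen for illustration, and input size is ignored here for simplicity.

```python
import math
from collections import defaultdict


def recommend_mem_gb(records, percentile=95, headroom=1.25):
    """Suggest a per-tool memory value (in GiB) from pooled usage records.

    `records` is an iterable of (tool_id, input_bytes, peak_mem_bytes)
    tuples. This layout is hypothetical; input_bytes is not used yet.
    """
    peaks_by_tool = defaultdict(list)
    for tool_id, _input_bytes, peak_mem_bytes in records:
        peaks_by_tool[tool_id].append(peak_mem_bytes)

    recommendations = {}
    for tool_id, peaks in peaks_by_tool.items():
        peaks.sort()
        # Nearest-rank percentile: the smallest observation that is at
        # least as large as `percentile` percent of all observations.
        rank = max(1, math.ceil(percentile / 100 * len(peaks)))
        recommendations[tool_id] = math.ceil(peaks[rank - 1] * headroom / 1024**3)
    return recommendations


if __name__ == "__main__":
    gib = 1024**3
    sample = [("example_tool", 10**9, n * gib) for n in range(1, 11)]
    print(recommend_mem_gb(sample))
```

A more useful version would probably model memory as a function of input size rather than a single value per tool.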

However, there are some things for people who are good at statistics to consider:

sanjaysrikakulam commented 1 week ago

For example, we could use the data from the GRT once the project is resurrected. Last week, I put together some thoughts on the project (feedback and suggestions are welcome) so that we could get a master's student working on it.

nuwang commented 1 week ago

@natefoo Have you seen this PR? https://github.com/galaxyproject/tpv-shared-database/pull/64 It was a preliminary attempt at this. The biggest problem so far has been the inconsistency of the data across the federation (e.g. invalid cgroup metrics). But I believe some of these issues have since been fixed, so we may be able to make a fresh pass soon.
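A fresh pass would likely need to discard implausible measurements before pooling. The sketch below is one way to do that, under the assumption that "invalid" means missing, non-numeric, non-positive, or absurdly large values. The actual failure modes seen in the federation are not described here, and the function name `is_plausible_peak` and the 1 PiB upper bound are hypothetical.

```python
def is_plausible_peak(peak_mem_bytes, max_bytes=2**50):
    """Return True if a peak-memory reading looks usable.

    Assumed definition of "invalid" (not taken from the PR): missing,
    non-numeric, NaN, zero or negative, or larger than `max_bytes`.
    """
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(peak_mem_bytes, bool) or not isinstance(peak_mem_bytes, (int, float)):
        return False
    # NaN fails both comparisons, so it is rejected here as well.
    return 0 < peak_mem_bytes <= max_bytes


if __name__ == "__main__":
    readings = [None, 0, -1, 2**30, float("nan"), 2**63]
    print([r for r in readings if is_plausible_peak(r)])
```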

natefoo commented 1 week ago

Thanks for the heads up! I missed that.