refgenie / refgenieserver

Serves a web interface and RESTful API for reference genome assets.
http://refgenie.databio.org
BSD 2-Clause "Simplified" License

archive doesn't populate the config if archive exists #94

Closed nsheff closed 3 years ago

nsheff commented 4 years ago

I've realized a source of some of my earlier trouble with missing assets on the server.

If I run the archiver, and then want to change some things and re-run it to recreate the config, what I do is something like:

  1. run archiver
  2. delete config file
  3. run archiver again
  4. try to upload config

But if the archiver detects that an archive is already there, it doesn't rebuild it, and then doesn't correctly populate the config. So the resulting config file ends up incorrect (it doesn't have asset hashes).

Should we make it so that if you re-run the archiver you can at least repopulate the config, even if you don't recreate the archive? Recompute the hashes, for example? Maybe make it an option? Just a thought. Otherwise, to re-init the config I have to delete all archives and start over.

stolarczyk commented 4 years ago

I see. In my mind, the scenario you describe is far less frequent than:

  1. run archiver on a large set of assets (takes a couple of hours)
  2. build a new asset
  3. run archiver on the newly built one (takes a couple of minutes)

The archiver was "optimized" towards this scenario -- incremental archive updates. Recalculating hashes would make step 3 take a really long time for every incremental update.

I still think this is what the default behavior should be. We have a --force option in the refgenieserver archive command, but this would actually build the archive. We will need to add a "softer" --force that recalculates hashes but skips archive creation. What do you think?

nsheff commented 4 years ago

Yeah, that makes sense -- or, it could actually just recalculate the archive digests only if they don't exist, by default. This would work for both scenarios, with no additional option, wouldn't it?
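A minimal sketch of that default, to make the control flow concrete. This is not the refgenieserver implementation: the function names, the flat dict standing in for the config, the "archive_digest" key, and the use of MD5 over a .tgz file are all assumptions made for illustration.

```python
import hashlib
import os
import tarfile


def file_md5(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_asset(asset_dir, archive_path, config, key, force=False):
    """Hypothetical archiving step; returns the list of actions taken.

    `config` is a plain dict standing in for the server config, and
    `key` identifies the asset in it (both are illustrative only).
    """
    actions = []
    entry = config.setdefault(key, {})

    # Build the archive only if it is missing, or if the caller forces it.
    if force or not os.path.exists(archive_path):
        with tarfile.open(archive_path, "w:gz") as tar:
            tar.add(asset_dir, arcname=os.path.basename(asset_dir))
        actions.append("built_archive")
        # Any digest recorded earlier no longer matches the new file.
        entry.pop("archive_digest", None)

    # Compute the digest only if the config does not already have one.
    if "archive_digest" not in entry:
        entry["archive_digest"] = file_md5(archive_path)
        actions.append("computed_digest")

    return actions
```

Under these assumptions, an incremental run over already-archived assets with an intact config does no work, while a run after deleting the config hashes the existing archives without rebuilding them.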

stolarczyk commented 4 years ago

yes, and that's probably the best solution

stolarczyk commented 4 years ago

it's implemented on dev if you wish to use this feature