Applied-GeoSolutions / gips

Geospatial Image Processing System
GNU General Public License v3.0

handle duplicate assets (of differing versions) #475

Closed ircwaves closed 6 years ago

ircwaves commented 6 years ago

This handles the case of duplicate assets of differing versions by keeping the highest-versioned asset and pitching the others. I suppose we could ignore the others and just return the file path for the latest and greatest, but I'm not feeling like a hoarder at the moment.
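For readers following along, here is a minimal sketch of the "keep the greatest, set aside the rest" idea. It is not the code in this PR: the `(scene_key, version, path)` tuple layout and the name `split_by_version` are assumptions made up for illustration, and versions are assumed to be tuples of ints so they compare correctly.

```python
from collections import defaultdict


def split_by_version(assets):
    """Split assets into the newest per scene and the stale remainder.

    `assets` is an iterable of (scene_key, version, path) tuples, where
    `version` is a tuple of ints such as (2, 1). This layout is a
    hypothetical stand-in, not GIPS's actual asset model.
    """
    by_scene = defaultdict(list)
    for scene_key, version, path in assets:
        by_scene[scene_key].append((version, path))

    keep, stale = {}, []
    for scene_key, candidates in by_scene.items():
        candidates.sort()  # ascending by version, then by path
        keep[scene_key] = candidates[-1][1]  # highest version wins
        stale.extend(path for _, path in candidates[:-1])
    return keep, stale
```

The caller decides what happens to `stale`: delete the paths (what this PR does) or simply ignore them and use `keep`.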

ra-tolson commented 6 years ago

Should this be aimed at dev instead of master?

ra-tolson commented 6 years ago

I appreciate wanting to trash files that are probably just cruft taking up space in the archive. That said, I think it would surprise users if we removed files during any given GIPS run, even though most of the time old versions of assets are probably okay to delete.

What do you think about moving this action to its own command? Like add a command that shows you what you can safely remove, and then some options for the cleaning itself:

You could also issue a warning on discovery of duplication that says something like "X duplicate assets discovered, remove with gips_clean."
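A rough sketch of how the warn-then-clean split might look, under stated assumptions: `gips_clean` does not exist yet, and the function names, the logger name, and the `dry_run` flag below are hypothetical. The only parts taken from this thread are the warning text and the idea of listing removable files before deleting anything.

```python
import logging
import os

logger = logging.getLogger("gips_clean_sketch")  # hypothetical logger name


def warn_duplicates(stale_paths):
    """Emit the suggested warning during a normal run; deletes nothing."""
    count = len(stale_paths)
    if count:
        logger.warning(
            "%d duplicate assets discovered, remove with gips_clean.", count
        )
    return count


def clean(stale_paths, dry_run=True):
    """Remove stale asset files, or only report them when dry_run is set.

    Returns the list of paths actually removed.
    """
    removed = []
    for path in stale_paths:
        if dry_run:
            logger.info("would remove %s", path)
        else:
            os.remove(path)
            removed.append(path)
    return removed
```

Defaulting `dry_run` to `True` means nothing is deleted unless the user asks for it explicitly, which is the point of moving deletion out of the regular run.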

ircwaves commented 6 years ago

Re: PR target

I guess it could go to dev, but master was puking due to old files being left in the archive. At the moment, GIPS doesn't support keeping multiple versions of assets around. I suppose just filtering the file list down to one might have allowed that and addressed the blocker. Or it could be addressed via a maintenance script, but this PR does what needed to be done to our archive for master GIPS to work for prism.

Re: gips clean (aka gips_janitor)

This is an idea that has been kicking around. There are many things that we'd want a janitor script to handle. With respect to deleting old files out from under users: whenever two versions of an asset exist, the GIPS driver is unusable for that scene. More message logging is a good idea.

Re: ORM

That makes sense. As we are moving toward making the database required, this is worth an issue.