plazi / treatmentBank

Repository devoted to house keeping of treatmentBank

understanding download creation and timing #22

Open punkish opened 2 years ago

punkish commented 2 years ago

@gsautter

You explained this to me before, but I am still confused… please clarify for me how the creation and naming of the zip archives works. If I look at the dumps today (https://tb.plazi.org/dumps/), I see the following:

| Filename | Description | Size | Last Modified |
| --- | --- | --- | --- |
| plazi.zenodeo.zip | Simplified Plain XML | 2548608 kb | Sat, 01 Jan 2022 02:00:00 GMT+0000 |
| plazi.zenodeo.monthly.zip | Simplified Plain XML (updates since plazi.zenodeo.zip) | 65 kb | Sun, 02 Jan 2022 02:00:00 GMT+0000 |

Since it has been only one day since the full archive was created, why is the second archive called "monthly" and not "daily"? From what you explained to me, I would have expected a daily archive for 6 days (from full archive creation), then on the 7th day a weekly archive would be created, then on the 30th day a monthly archive… what's going on?

gsautter commented 2 years ago

You're right, this turn-of-the-year behavior may seem a bit counterintuitive at first ... however, there's a simple idea behind it: the finer granularities don't exist without the coarser ones. In particular, it means this:

This allows for a very simple download/import logic:

In particular, it saves you even checking for the daily if you don't find the weekly, etc.
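A minimal sketch of how client code might use this rule, under stated assumptions: walk the incremental archives from coarsest to finest and stop at the first one that is missing. Only `plazi.zenodeo.zip` and `plazi.zenodeo.monthly.zip` appear in the listing quoted in this thread; the weekly and daily file names, the function names, and the use of an HTTP HEAD request are assumptions made for illustration, not the server's documented behavior.

```python
import urllib.error
import urllib.request
from typing import Callable, List

BASE_URL = "https://tb.plazi.org/dumps/"
FULL_ARCHIVE = "plazi.zenodeo.zip"

# Coarsest to finest. Only the monthly name is confirmed by the listing;
# the weekly and daily names are ASSUMED to follow the same pattern.
INCREMENTAL_ARCHIVES = [
    "plazi.zenodeo.monthly.zip",
    "plazi.zenodeo.weekly.zip",  # assumed name
    "plazi.zenodeo.daily.zip",   # assumed name
]


def remote_exists(name: str) -> bool:
    """Hypothetical existence check: HEAD request against the dumps folder."""
    request = urllib.request.Request(BASE_URL + name, method="HEAD")
    try:
        with urllib.request.urlopen(request) as response:
            return response.status == 200
    except urllib.error.URLError:
        # Covers HTTP errors such as 404 as well as connection failures.
        return False


def archives_to_import(exists: Callable[[str], bool]) -> List[str]:
    """Return archive names in import order, stopping at the first gap."""
    selected = [FULL_ARCHIVE]
    for name in INCREMENTAL_ARCHIVES:
        if not exists(name):
            # A finer archive cannot exist without this coarser one,
            # so there is no need to probe any further.
            break
        selected.append(name)
    return selected
```

Calling `archives_to_import(remote_exists)` would probe the live server; passing any other predicate (for example, a lookup in a set of known names) makes the same logic usable offline.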

The weekly and daily will become available in the next couple of days, and then things will go back to their normal course.

If you have a different idea of how the naming logic should behave, please let me know ... the current one is simply what I figured might be easiest to handle for client code (using the above logic) ... and I might well be wrong ...