etf-validator / etf-webapp

:earth_africa: :mag: ETF is an open source testing framework for spatial data and services
https://www.etf-validator.net
European Union Public License 1.2

Removing old data from public ETF instance #173

Closed michellutz closed 4 years ago

michellutz commented 6 years ago

We are having issues again with the storage space available in our public ETF instance (see this recent comment).

We already have a regular cron job running that removes all files older than 15 days from the folder .etf/testdata.
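For readers who want to reproduce this kind of age-based cleanup, here is a minimal Python sketch. It is not the cron job used on the public instance; the function name `remove_old_files` and the use of the file modification time as the age criterion are assumptions made for illustration.

```python
import time
from pathlib import Path


def remove_old_files(root, max_age_days, now=None):
    """Delete regular files under root whose mtime is older than max_age_days.

    Hypothetical helper: returns the list of deleted paths and leaves
    directories in place.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    removed = []
    # Materialise the listing first so deletions do not disturb the walk.
    for path in list(Path(root).rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

A call such as `remove_old_files("/path/to/.etf/testdata", 15)` would mirror the 15-day rule; the path is a placeholder and depends on where the ETF data directory lives.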

However, doing some more digging in the other data folders described in 3.3. ETF data directory structure of the Admin manual, we found that the following ones still contain vast amounts of data:

According to the documentation,

I assume that, if we want to keep only the test reports for the last x (say 15) days available in the application, we can also remove any sub-directories and attachments in these folders that are older than 15 days. Correct?
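Before deleting anything, it may help to see how much space such a rule would free. The sketch below only reports candidates and deletes nothing. It assumes that a sub-directory is "older than x days" when its newest file is older than the cutoff; the function name `stale_subdirs` is hypothetical, and whether these folders can be removed safely is exactly the question above.

```python
import time
from pathlib import Path


def stale_subdirs(root, max_age_days, now=None):
    """Report immediate sub-directories of root with no file newer than the cutoff.

    Returns a list of (path, total_size_in_bytes) tuples; nothing is deleted.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    report = []
    for sub in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        files = [f for f in sub.rglob("*") if f.is_file()]
        # Age is taken from the newest file inside the directory; an empty
        # directory falls back to its own modification time.
        newest = max((f.stat().st_mtime for f in files),
                     default=sub.stat().st_mtime)
        if newest < cutoff:
            report.append((sub, sum(f.stat().st_size for f in files)))
    return report
```

Summing the second element of each tuple gives a rough estimate of the space that would be reclaimed for a given value of x.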

Furthermore, I am wondering if we could "clean up" data also in the other folders, e.g.

carlospzurita commented 6 years ago

Anyhow, the ETF is configured by default to run a job every midnight that checks the date of the TestRun files and deletes those older than the number specified in the etf/config/etf-config.properties file.

michellutz commented 6 years ago

@df-git Could you please have a look at the config files and the logs to see whether the TestRun files are being deleted as they should?

carlospzurita commented 6 years ago

If you can provide us with the log files, and the ETF folders that you have in your environment, we can also take a look at it.

carlospzurita commented 6 years ago

Looking at the ETF logs that were provided, it seems that the scheduled task for cleaning test objects and results runs every night and reports that a number of files were cleaned. There are a couple of TestResult items that are reported as non-existent:

Can you check whether those files are still in the etf folders? Also, can you rebuild and redeploy the application after changing the properties

to 480? This sets an expiration time of 8 hours, which should ensure that everything gets cleaned at midnight. That way you can check every day which files are not being deleted.
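To make the effect of the 480 value concrete, here is a small sketch of the expiry arithmetic. It assumes the value is in minutes (480 minutes = 8 hours) and that an item expires once its age reaches that lifetime. The function `is_expired` and the dates are illustrative only and are not taken from the ETF code.

```python
from datetime import datetime, timedelta


def is_expired(created, now, lifetime_minutes):
    """Return True if an item created at `created` has outlived its lifetime."""
    return now - created >= timedelta(minutes=lifetime_minutes)


# A cleanup run at midnight with an assumed 480-minute lifetime.
midnight = datetime(2019, 1, 2, 0, 0)
morning_run = datetime(2019, 1, 1, 9, 30)  # 14.5 hours old at midnight
evening_run = datetime(2019, 1, 1, 20, 0)  # only 4 hours old at midnight

print(is_expired(morning_run, midnight, 480))  # True
print(is_expired(evening_run, midnight, 480))  # False
```

Under these assumptions, anything created after 16:00 would survive the first midnight run and be removed the following night, which is worth keeping in mind when checking what is left over each day.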

In the meantime we are checking in our premises the behaviour, and we will get back to you as soon as we have more information.

carlospzurita commented 6 years ago

We checked on our premises, and it seems that even though the references are removed from the BaseX storage, the files associated with the test runs in ds/attachements are not removed. Did you execute the cleanup on your deployment? How is it now?

jonherrmann commented 4 years ago

Closing this in favour of: