atdservicebot closed this issue 5 years ago
Server ran out of space again today. I cleared sent XMLs and old backup CSVs. Issue resolved.
@sergiogcx lots of easy fixes for this, i know. would you take a look as your time allows?
Definitely. It would be great to find out how you are monitoring the server status.
In the meantime, we could set up a cron to upload old xml/csv files to s3 and to clear the disk.
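A minimal sketch of what that cron script could look like, assuming the AWS CLI is installed and configured on the server; the bucket name is hypothetical:

#!/bin/bash
# Sketch: upload xml/csv files older than 7 days to S3, then delete them
# locally. The bucket name is a placeholder, not the real destination.
BASE_DIR=/home/publisher/atd-data-publishing/transportation-data-publishing
find "$BASE_DIR" \( -name "*.xml" -o -name "*.csv" \) -mtime +7 | while read -r FILE; do
    aws s3 cp "$FILE" "s3://example-archive-bucket/${FILE#$BASE_DIR/}" && rm -f "$FILE"
done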
I am thinking we could pair the same cron job with CloudWatch or SNS alarms, or set up something more sophisticated such as Zabbix or Zenoss (there are ready-to-go containers).
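For the alarm route, here is a sketch of a cron-able disk check that publishes usage to CloudWatch; the namespace and metric name are assumptions, and an SNS-backed alarm would be configured against the metric separately:

#!/bin/bash
# Sketch: publish root-volume usage as a custom CloudWatch metric.
# Namespace and metric name are made up; requires AWS CLI credentials.
USAGE=$(df --output=pcent / | tail -n 1 | tr -dc '0-9')
aws cloudwatch put-metric-data \
    --namespace "atd-data01" \
    --metric-name "DiskUsedPercent" \
    --value "$USAGE" \
    --unit Percent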
@sergiogcx longer term, I know we want to use a monitoring service, but I've changed the scope of this issue to address near-term concerns. Can you find some time to either (1) set up log rotation to remove old files from these locations or (2) set up a simple bash script that does this on a cron schedule?
@johnclary This is done, just needs your review:
For the data folder:
~/maintenance-scripts/transportation-data-publishing-datacsv.sh
For the xml/sent folder:
~/maintenance-scripts/transportation-data-publishing-dataxml.sh
For testing, just run (it will only print the files that are going to be deleted):
bash ~/maintenance-scripts/transportation-data-publishing-datacsv.sh
bash ~/maintenance-scripts/transportation-data-publishing-dataxml.sh
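For context, a minimal sketch of what the cleanup logic in these scripts presumably looks like, based on the rm line quoted further down and the one-week retention described in this issue; the exact structure is an assumption:

#!/bin/bash
# Sketch of the csv variant: list CSVs older than 7 days; the deletion
# stays commented out until the script has been reviewed.
DATA_DIR=/home/publisher/atd-data-publishing/transportation-data-publishing/data
find "$DATA_DIR" -name "*.csv" -mtime +7 | while read -r DATA_FILE; do
    echo "would delete: $DATA_FILE"
    # rm -f $DATA_FILE;
done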
Crontab (every day at noon, and five minutes after):
sudo crontab -l
0 12 * * * bash (the script path for datacsv will show here)
5 12 * * * bash (the script path for dataxml will show here)
Review, and please edit line 44 of each script:
nano +44 ~/maintenance-scripts/transportation-data-publishing-datacsv.sh
nano +44 ~/maintenance-scripts/transportation-data-publishing-dataxml.sh
Uncomment the rm line so the deletion actually runs (remove the leading #):
# rm -f $DATA_FILE;
My understanding is that crontab runs from root's home directory; I tested the script execution from there and it lists all the files to be deleted.
Also: do we care to log the files that have been deleted?
@sergiogcx beautiful. the test ran successfully and i uncommented the rm statement.
i do not care to log the CSV files, but logging the XMLs makes sense to me. would you just keep a log in /maintenance-scripts?
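One way to do that from inside the xml script (a sketch; the log file name and entry format are assumptions) is to append a line before each deletion:

# Sketch: inside the xml script's loop, log each file before removing it.
LOG_FILE=~/maintenance-scripts/xml-cleanup.log
echo "$(date) deleted $DATA_FILE" >> "$LOG_FILE"
rm -f "$DATA_FILE"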
We have a few processes that write data to our scripting server (atd-data01). We do not have any automated cleanup to delete/rotate these files. The locations of concern are:
/home/publisher/atd-data-publishing/transportation-data-publishing/xml/sent
This directory contains files of XML messages that were sent to the ESB. We have never had a situation where we needed to restore these records. There's no need to keep records that are more than one week old.
/home/publisher/atd-data-publishing/transportation-data-publishing/data
This directory contains copies of our Data Tracker records as CSVs. Knack provides a restore service, so these serve as an added precaution. It is very rare that we've needed these, but it does happen. We can remove records that are more than one week old.