Closed ains-arch closed 6 months ago
I was going to suggest all of the things you just included in your edit. So it seems to me like you're on the right track.
Log files shouldn't be consuming much disk space (if everything is working correctly). But if you are concerned about it, you can disable logging for a container by following these instructions: https://stackoverflow.com/questions/34590317/disable-logging-for-one-container-in-docker-compose.
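For reference, that Stack Overflow answer boils down to a per-service `logging` key in `docker-compose.yml`. A minimal sketch (the service name `db` is a placeholder, not from your compose file):

```yaml
services:
  db:
    image: postgres
    logging:
      driver: none   # discard this container's logs entirely
```

If you'd rather keep logs but bound their size, the default `json-file` driver also accepts `max-size` and `max-file` options instead of disabling logging outright.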
I'm trying to add millions of rows to my prod database. I'm making my own dataset rather than using the Twitter dataset, so I think I may be doing a bad job of managing how the data is stored. I have taken down the containers, removed the volumes, and restarted my process, but I've run into

```
-bash: cannot create temp file for here-document: Disk quota exceeded
```

problems twice. I have run the `du -hd1` command in my home directory, and the issue seems to be due to the size of the `.local/share/docker` directory, specifically the `containers` and `overlay2` directories. Here is my prod dockerfile:
It seems like I'm currently mounting the database to the `bigdata` folder, which doesn't count toward my disk quota. The other volumes should be very small.
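To see what Docker itself thinks is taking the space, `docker system df` breaks usage down by images, containers, local volumes, and build cache (requires a running Docker daemon, so output will vary by machine):

```shell
# summarize Docker's disk usage by category
docker system df

# -v adds a per-image / per-container / per-volume breakdown
docker system df -v
```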
I think the problem may be the size of the logs. My random data generation and insertion code is very... bad, and the way I'm handling the constraints of the database is to try to add the random data and, if it breaks unique or foreign key constraints, roll it back. But when I look at the log output, it has every single failed insertion attempt in there and is, therefore, huge. I tried to find where the logs are stored by opening a shell on the container with the prod database, but it doesn't seem like there's anything in `/var/log/postgresql`, and I don't know where else they would be. Would appreciate any tips on how to handle clearing the logs, or disk usage in general, or whether I just need to rewrite my data insertion script so that it doesn't constantly throw and ignore errors.
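In case it helps anyone else who goes looking: the official Postgres image logs to stderr, and Docker's default `json-file` driver captures that under the `containers` directory on the host, not in `/var/log/postgresql` inside the container. Something like the following should locate (and, if needed, empty) that file; the container name `prod_db` is a placeholder:

```shell
# view the container's captured stdout/stderr
docker logs prod_db | tail

# ask Docker where the underlying json log file lives on disk
docker inspect --format='{{.LogPath}}' prod_db

# empty the file in place without deleting it, so Docker keeps writing to it
truncate -s 0 "$(docker inspect --format='{{.LogPath}}' prod_db)"
```

Since `.local/share/docker` implies rootless Docker, the log file should be owned by your own user and no `sudo` is needed for the `truncate`.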
cc: @westondcrewe https://github.com/mikeizbicki/cmc-csci143/issues/561#issuecomment-2094942254
update: I took down the containers and removed my volumes because I wasn't close to 10 million rows anyway. That directly reclaimed 1.727GB and seems to have indirectly cleared the `containers` folder. There was still 5G in `.local/share/docker/overlay2`, so I also ran `docker system prune`, and I'm just crossing my fingers that deleting networks and images to reclaim another 2.05 GB didn't break anything. I would still really like to know how to avoid getting to this point again, especially if I'm right that part of my problem is the log files.
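One way to stop the log spam at the source: instead of inserting and rolling back on constraint violations (each of which Postgres records as an `ERROR` line), Postgres can skip conflicting rows silently with `ON CONFLICT DO NOTHING`. A sketch via `psql` inside the container; the container, database, user, and table names here are all placeholders:

```shell
# duplicate keys are skipped quietly instead of raising (and logging) an error
docker exec prod_db psql -U postgres -d proddb -c \
  "INSERT INTO users (id, name) VALUES (1, 'alice') ON CONFLICT DO NOTHING;"
```

Note that this only absorbs unique-constraint conflicts; a bad foreign key still errors, so for random child rows it's better to sample keys that already exist in the parent table than to insert and hope.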