RTXteam / RTX-KG2

Build system for the RTX-KG2 biomedical knowledge graph, part of the ARAX reasoning system (https://github.com/RTXTeam/RTX)
MIT License

`binlog` Files in `/var/lib/mysql` Take Up a Ton of Space #336

Open ecwood opened 11 months ago

ecwood commented 11 months ago

While working on #321, the SemMedDB extraction ran out of space on the instance while dumping the MySQL output, which is bizarre given the size of the instance. After checking out the /var/lib/mysql directory, I was surprised to learn that it was taking up over 600G of space. This was due to the binlog.* files, which took up 400G of disk space and are created continuously. After deleting those, the MySQL database was back to a far more reasonable 213G. According to https://dev.mysql.com/doc/refman/8.0/en/replication-options-binary-log.html#option_mysqld_log-bin, there should be a way to stop these files from being created.
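To keep an eye on this during a build, something like the following could report how much of the data directory is binlog files. This is only a rough sketch: the function name `binlog_usage` is hypothetical (it is not part of RTX-KG2), and it assumes the default `binlog` basename and that the process can read the data directory.

```python
from pathlib import Path


def binlog_usage(data_dir: str = "/var/lib/mysql") -> tuple[int, int]:
    """Return (file count, total bytes) for binlog.* files in data_dir.

    Hypothetical helper for illustration. Note that the small
    binlog.index file also matches the pattern and is counted.
    """
    # Only look at regular files directly inside the data directory.
    files = [p for p in Path(data_dir).glob("binlog.*") if p.is_file()]
    total_bytes = sum(p.stat().st_size for p in files)
    return len(files), total_bytes
```

For example, `count, size = binlog_usage()` followed by `print(f"{count} files, {size / 1024**3:.1f} GiB")` would give a quick before/after comparison around a SemMedDB load.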

ecwood commented 11 months ago

We can't close this issue out yet because, while trying to build using the merged JSON Lines code (#321), 100G worth of binlog.* files were created despite having

[mysqld]
skip-log-bin

in the mysql-config.conf file.
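One thing worth ruling out is that the option is missing from (or misplaced in) the file the server actually reads. A minimal sketch of that check is below; `binlog_disabled_in_config` is a hypothetical name, and the parsing is an approximation, since Python's `configparser` is not MySQL's own option-file parser (for instance, it will raise on an `!includedir` line that appears before any section header). It only inspects the file; on a running server, `SHOW VARIABLES LIKE 'log_bin';` shows whether binary logging is really off.

```python
import configparser


def binlog_disabled_in_config(path: str) -> bool:
    """Return True if [mysqld] in the given option file disables binary logging.

    Hypothetical helper for illustration. It checks for skip-log-bin or its
    synonym disable-log-bin, accepting underscores in place of hyphens.
    """
    # allow_no_value: options such as skip-log-bin carry no "= value".
    # strict=False: tolerate duplicate options and sections.
    parser = configparser.ConfigParser(allow_no_value=True, strict=False)
    parser.read(path)  # a missing file is silently skipped
    if not parser.has_section("mysqld"):
        return False
    keys = {key.replace("_", "-") for key in parser["mysqld"]}
    return bool(keys & {"skip-log-bin", "disable-log-bin"})
```

If this returns `True` for mysql-config.conf and binlog.* files still appear, the server is probably not picking up that file or section, which would point at how MySQL is being started or configured rather than at the option itself.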

ecwood commented 11 months ago

I suspect the problem is populate_mysql_db_configured.sh in the umls directory.