RTXteam / RTX-KG2

Build system for the RTX-KG2 biomedical knowledge graph, part of the ARAX reasoning system (https://github.com/RTXTeam/RTX)
MIT License

`binlog` Files in `/var/lib/mysql` Take Up a Ton of Space #336

Open ecwood opened 1 year ago

ecwood commented 1 year ago

While working on #321, SemMedDB's extraction ran out of space on the instance while dumping the MySQL output, which is surprising given the size of the instance. After checking the /var/lib/mysql directory, I found that it was taking up over 600G. Most of that, about 400G, was binlog.* files, which MySQL creates continuously. After deleting them, the MySQL data directory was back to a far more reasonable 213G. According to https://dev.mysql.com/doc/refman/8.0/en/replication-options-binary-log.html#option_mysqld_log-bin, there should be a way to stop these files from being created.
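To put a number on the problem before and after a build, something like the following could be used. It is only a sketch: the function name `binlog_usage` is made up for illustration, and it assumes the binary logs use the default `binlog.*` naming under /var/lib/mysql, as they did here. Note that the pattern also matches the `binlog.index` file, and that reading the MySQL data directory usually requires root.

```python
from pathlib import Path


def binlog_usage(data_dir="/var/lib/mysql", pattern="binlog.*"):
    """Return (file_count, total_bytes) for files matching pattern in data_dir.

    Hypothetical helper: the default directory and pattern mirror the ones
    mentioned in this issue and may differ on other installations.
    """
    files = [p for p in Path(data_dir).glob(pattern) if p.is_file()]
    total_bytes = sum(p.stat().st_size for p in files)
    return len(files), total_bytes
```

Calling `binlog_usage()` as root and dividing the second value by `1024**3` gives the size in GiB, which can be compared against the output of `du` on the same directory.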

ecwood commented 1 year ago

We can't close this issue out yet because, while trying to build using the merged JSON Lines code (#321), 100G worth of binlog.* files were created despite having

[mysqld]
skip-log-bin

in the mysql-config.conf file.
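One thing worth ruling out is whether the option is actually in the `[mysqld]` section of the file being checked. A rough sketch of such a check is below; `binlog_disabled_in_config` is a hypothetical helper, and it relies on Python's `configparser` being able to read the file, which may not hold for MySQL config files that use `!include` or `!includedir` directives. It also only inspects the text of a config file. It says nothing about whether the running server read that file.

```python
import configparser


def binlog_disabled_in_config(text):
    """Return True if the [mysqld] section contains skip-log-bin or disable-log-bin.

    Hypothetical helper: checks config text only, not the running server.
    """
    # allow_no_value lets bare options such as "skip-log-bin" parse as keys;
    # strict=False tolerates duplicate sections or options.
    parser = configparser.ConfigParser(allow_no_value=True, strict=False)
    parser.read_string(text)
    if not parser.has_section("mysqld"):
        return False
    # MySQL accepts dashes and underscores interchangeably in option names.
    keys = {key.replace("_", "-") for key in parser["mysqld"]}
    return bool(keys & {"skip-log-bin", "disable-log-bin"})
```

To confirm what the server itself is doing, running `SHOW VARIABLES LIKE 'log_bin';` from a MySQL client reports whether binary logging is enabled at runtime. If that still shows `ON`, the server is probably not picking up this config file, or the option is being overridden elsewhere.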

ecwood commented 1 year ago

I suspect the problem is populate_mysql_db_configured.sh in the umls directory.