josiah-wolf-oberholtzer / discograph

Social Graphing for the Discogs Database
MIT License

Explode Discogs XML files into per-record files. #16

Closed josiah-wolf-oberholtzer closed 9 years ago

josiah-wolf-oberholtzer commented 9 years ago

Name the files by their intended class and Discogs ID.

The massive files Discogs provides are cumbersome. If any part of an import fails, the file must be rescanned from the beginning up to the point of failure, which can take hours. With one file per record, it should be easier to sort and search the file system, compare against the current contents of the database, and continue from there.
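A rough sketch of what the split could look like, using only the standard library's streaming parser. The function name `explode`, the `<class>-<id>.xml` file naming, and the assumption that each record carries its Discogs ID either as an `id` attribute or as an `<id>` child element are illustrative guesses, not something the dumps or this project guarantee.

```python
import os
import xml.etree.ElementTree as ET


def explode(source, output_dir, tag):
    """Write each top-level <tag> record in `source` to its own file.

    `source` is a path or a binary file object. Returns the number of
    files written. Hypothetical helper, not part of discograph.
    """
    os.makedirs(output_dir, exist_ok=True)
    count = 0
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    depth = 0  # nesting depth below the root
    for event, element in context:
        if event == "start":
            depth += 1
            continue
        depth -= 1
        # Only handle direct children of the root, so nested elements
        # that happen to share the tag name are left inside their record.
        if depth != 0 or element.tag != tag:
            continue
        # Assumption: the ID is an attribute or an <id> child element.
        record_id = element.get("id") or element.findtext("id")
        if record_id:
            name = "{}-{}.xml".format(tag, record_id.strip())
            path = os.path.join(output_dir, name)
            ET.ElementTree(element).write(path, encoding="utf-8")
            count += 1
        root.clear()  # drop finished records so memory use stays flat
    return count
```

Because the file names encode class and ID, a failed import could resume by listing the output directory and skipping IDs already present in the database, rather than rescanning the dump.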

josiah-wolf-oberholtzer commented 9 years ago

It's not clear whether this will help. An overriding concern is that separating the records out into individual XML files generally causes us to run out of disk space on the current dev machine.

josiah-wolf-oberholtzer commented 9 years ago

Not necessary.