OHDSI / MedlineXmlToDatabase

A command line Java application for parsing MEDLINE XML files and inserting the data into a relational database
Apache License 2.0

update should process XML files in order #2

Closed (rkboyce closed this issue 7 years ago)

rkboyce commented 9 years ago

Hi,

I used this code to create a local PubMed instance in December 2014 and think it is great. This is my first attempt at using it to update the database, and it is nice that updating seems as simple as running the code with -parse and the path to the folder containing the update records. However, the output to stdout shows that the code does not process the XML files in order, which is a potential problem. I believe the recommended approach is to process them in the order NLM released them, which can be inferred from the last four digits of the file name. This might be a simple change, but I don't know where in the code to make it.

thanks, -R

schuemie commented 9 years ago

Hi Rich,

Yes, processing the XML files in order is absolutely essential. The code always did so on my Windows machine, but that order was never guaranteed.

I added a few lines of code (1ad758a18f9ea1f520792fb93e131691812493fa) that will guarantee the files are processed in order. Can you test it?

Thanks!
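
For readers wondering why the order was not guaranteed: Java's `File.listFiles()` makes no promise about the order of the entries it returns, so the order can vary by platform and file system. The sketch below shows one way to force a stable order by sorting on file name before processing. It is not necessarily what the commit referenced above does. The class name, method names, and example file names are hypothetical, and the sketch assumes the update files share a common prefix followed by a zero-padded number, so that alphabetical order matches release order.

```java
import java.io.File;
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper, not part of MedlineXmlToDatabase.
class OrderedMedlineFiles {

    // Sorts the array in place by file name and returns it.
    // Assumes names share a prefix and end in a zero-padded number,
    // so lexicographic order equals release order.
    static File[] sortByName(File[] files) {
        Arrays.sort(files, Comparator.comparing(File::getName));
        return files;
    }

    // Lists the regular files in a folder, sorted by name.
    // Returns an empty array if the folder cannot be read.
    static File[] listInOrder(File folder) {
        File[] files = folder.listFiles(File::isFile);
        if (files == null) {
            return new File[0];
        }
        return sortByName(files);
    }

    public static void main(String[] args) {
        // Use the folder given on the command line, or the current directory.
        File folder = new File(args.length > 0 ? args[0] : ".");
        for (File file : listInOrder(folder)) {
            System.out.println(file.getName());
        }
    }
}
```

If the file names did not share a prefix, sorting on the parsed trailing digits would be the safer choice.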