Almost every chemical database provides an SDF or XML export of its data. In connection with issue #4, make a memory-efficient parser that extracts only the attributes of interest and passes them on for further processing.
The loader must be capable of processing XML and SDF files and must be easily extensible (for example, when we discover a new chemical attribute that could be mined, or when we need to load data from a new file type).
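
One possible shape for such a loader is sketched below in Python. It is only an illustration under stated assumptions, not a proposed final design: the names `PARSERS`, `register`, `parse_sdf`, `parse_xml`, and `load` are hypothetical, the XML record element is assumed to be called `compound`, and XML tags are assumed to carry no namespaces. Both parsers are generators that read the input incrementally and keep only the requested attributes, so a whole export never has to sit in memory. A new file type would be added by registering one more function.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical registry: format name -> parser generator function.
PARSERS = {}


def register(fmt):
    """Decorator that adds a parser function to the registry under `fmt`."""
    def decorator(func):
        PARSERS[fmt] = func
        return func
    return decorator


@register("sdf")
def parse_sdf(stream, wanted):
    """Yield one dict per SDF record, holding only the data items in `wanted`.

    Simplification: any line starting with '>' is treated as a data header,
    so the connection table (molblock) is simply skipped line by line.
    """
    record, field = {}, None
    for raw in stream:
        line = raw.rstrip("\r\n")
        if line.startswith("$$$$"):      # record terminator
            yield record
            record, field = {}, None
        elif line.startswith(">"):       # data header, e.g. "> <NAME>"
            start, end = line.find("<"), line.rfind(">")
            name = line[start + 1:end] if 0 <= start < end else ""
            field = name if name in wanted else None
        elif field is not None:
            if line == "":               # a blank line ends the data item
                field = None
            elif field in record:        # multi-line value
                record[field] += "\n" + line
            else:
                record[field] = line


@register("xml")
def parse_xml(stream, wanted, record_tag="compound"):
    """Yield one dict per `record_tag` element, keeping child tags in `wanted`.

    `record_tag` is an assumed element name; real exports will differ.
    """
    record = {}
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == record_tag:
            yield record
            record = {}
            elem.clear()                 # drop the finished subtree
        elif elem.tag in wanted:
            record[elem.tag] = (elem.text or "").strip()


def load(stream, fmt, wanted):
    """Look up the parser registered for `fmt` and stream records from it."""
    return PARSERS[fmt](stream, set(wanted))
```

Because `load` returns a generator, downstream processing can consume records one at a time, for example `for rec in load(open(path), "sdf", {"NAME"}): ...` where `path` is a placeholder. Mining a newly discovered attribute then only requires adding its name to `wanted`.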