Regardless of which dataset is to be provided using the scripts, there are a couple of invariant steps:
dataset source files need to be obtained from a server
dataset files have to be converted and/or recompressed to be fed into VirtBulkLoader
auto-indexing has to be temporarily disabled
VirtBulkLoader has to be configured for the converted dataset files and started
the VirtBulkLoader import process has to be checked for errors
auto-indexing has to be re-enabled
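The step chain above can be sketched as a small Python pipeline. All step names and the two indexing callables below are hypothetical placeholders (a real tool would shell out to a downloader, the conversion commands, the Virtuoso isql client, etc.); the sketch only illustrates the ordering and the guarantee that auto-indexing is re-enabled even after a critical failure of the import.

```python
def run_pipeline(steps, disable_indexing, enable_indexing):
    """Run the named import steps in order.

    Auto-indexing is disabled first and re-enabled in a finally block,
    so the re-enable happens even when a stage raises.
    """
    disable_indexing()
    try:
        for name, step in steps:
            try:
                step()
            except Exception as exc:
                # more informative error message naming the failed stage
                raise RuntimeError(f"stage '{name}' failed: {exc}") from exc
    finally:
        enable_indexing()  # runs even after a critical failure

# Demonstration with stub steps that just record what ran.
log = []
steps = [
    ("download", lambda: log.append("download")),
    ("convert", lambda: log.append("convert")),
    ("bulk-load", lambda: log.append("bulk-load")),
    ("check-errors", lambda: log.append("check-errors")),
]
run_pipeline(
    steps,
    disable_indexing=lambda: log.append("indexing-off"),
    enable_indexing=lambda: log.append("indexing-on"),
)
print(log)
```

With the stub steps, the recorded order is indexing-off, the four stages, then indexing-on; replacing one stub with a raising function still leaves "indexing-on" as the last entry.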
These are tasks with dependencies among each other that could be expressed and configured using a process-oriented build automation tool like Make, SCons, Ant or Gradle. The user would then (if all goes well) only have to invoke a single call for the whole process chain of loading one dataset (DBpedia, Freebase, ...), and it would presumably be easier to implement more informative error messages for failed stages and appropriate rollback/recovery actions (e.g. ensuring auto-indexing is re-enabled even after a critical failure of the import).
A small Python command line tool might also be a good alternative for this use case and its scope. (Perl and Ruby would work fine as well, but Python is perhaps preferable due to its greater familiarity.)
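As a rough idea of what the single user-facing call could look like, here is a minimal command line sketch using only the standard library's argparse. The dataset names and the flag are illustrative placeholders, not part of any existing script.

```python
import argparse

def build_parser():
    # One call per dataset: the positional argument selects which
    # process chain (download, convert, load, check) gets run.
    parser = argparse.ArgumentParser(
        description="Load a dataset into Virtuoso via VirtBulkLoader")
    parser.add_argument("dataset", choices=["dbpedia", "freebase"],
                        help="which dataset's process chain to run")
    parser.add_argument("--keep-downloads", action="store_true",
                        help="do not delete the downloaded source files")
    return parser

# Parsing an example invocation instead of sys.argv, for illustration:
args = build_parser().parse_args(["dbpedia", "--keep-downloads"])
print(args.dataset, args.keep_downloads)
```

The parsed namespace would then be handed to the pipeline code; argparse also gives the more informative usage and error messages mentioned above for free.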