Closed KonradHoeffner closed 1 year ago
There is already a logger in this class, so we can just use it for these messages. The only thing I'm not sure about is whether the Java command line would break this way. My best guess is that System.out is used for that reason.
Can you explain your use case in more detail? I'm still a bit confused about why you use stdout to create your CSV. Can you not write to a file?
Sure, my use case is taking an existing benchmark suite for RDF libraries without HDT and extending it with measurements of RDF libraries with HDT. We are writing an HDT library in Rust and want to know how its performance compares to that of the existing libraries, both with and without HDT.
If you are interested, you can see the plots for the Jupyter Notebook here.
Because the libraries use many different programming languages, the benchmark is structured like this: there is a Python program that recognizes different tools and tasks. Depending on the tool and task selected, it runs the corresponding tool, which lives in one of the subfolders, multiple times for each dataset size. The benchmarking subprogram for that tool responds by printing one line of CSV output to stdout, and the Python program merges all those lines into one CSV file per tool. Then you can start JupyterLab, generate the plots and see the scores.
I could modify everything to use files instead, but it would be a large amount of refactoring and does not fit well, because each program is run many times and the benchmarking suite combines all the different results.
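To make the structure described above concrete, here is a minimal sketch of how such a driver could collect one CSV line per run from stdout. It is not the actual benchmark code: the function names, the stand-in tool and the CSV fields are hypothetical and only illustrate the pattern.

```python
import subprocess
import sys


def run_once(command):
    """Run one benchmark process and return the single CSV line it prints."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def collect(command, runs):
    """Run the same command several times and gather one CSV line per run."""
    return [run_once(command) for _ in range(runs)]


# Stand-in for a real benchmark binary (hypothetical): prints one CSV line.
fake_tool = [sys.executable, "-c", "print('mytool,query,1000,0.5')"]

lines = collect(fake_tool, runs=3)
merged = "\n".join(lines) + "\n"  # content of the per-tool CSV file
```

With this pattern, anything else a tool prints to stdout ends up in the merged CSV, which is why extra output from a library is a problem here.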
P.S.: Oh hi Dennis, it seems you are everywhere :-)
@ate47 thank you for this quick fix. @KonradHoeffner you can check out dev, compile it and run your tests.
an HDT implementation in Rust 😍 (so also Python?)
@mielvds: Yes! I was looking for an HDT library for Rust last year and was surprised that there wasn't any on crates.io. However @timplication had one on GitHub and he allowed me to continue it under an open license.
You can try it at https://github.com/konradhoeffner/hdt or find it on crates.io at https://crates.io/crates/hdt. It's still under development though and doesn't have all the functions of the C++ and Java versions, i.e. it only supports the default triple order and the default HDT variant. But all the triple pattern querying and the indexes are there.
I haven't directly interfaced Rust with Python though; the Python script just executes the binary.
I am benchmarking several RDF libraries with a benchmark suite that creates CSV output, but HDT Java prints several lines of output that corrupt the CSV files. It seems that this cannot be disabled via logging, as they are written directly via System.out.println. Would it be possible to remove those statements or use a logging library instead?
Here is the output of HDTManager.loadIndexedHDT:
There are several System.out.println statements in https://github.com/rdfhdt/hdt-java/blob/master/hdt-java-core/src/main/java/org/rdfhdt/hdt/hdt/impl/HDTImpl.java.
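Until such statements are removed, one possible workaround on the benchmark side is to filter the captured stdout before merging. The sketch below assumes that the CSV line is the last line with the expected number of comma-separated fields. The function name, the sample messages and the field count are hypothetical and are not the real output of HDT Java.

```python
def extract_csv_line(stdout, expected_fields):
    """Return the last non-empty line that has the expected number of fields."""
    candidates = [line.strip() for line in stdout.splitlines() if line.strip()]
    for line in reversed(candidates):
        if len(line.split(",")) == expected_fields:
            return line
    raise ValueError("no line with the expected number of CSV fields")


# Hypothetical captured output: two stray messages followed by the CSV line.
noisy = "some diagnostic message\nanother message\nmytool,query,1000,0.5\n"
clean = extract_csv_line(noisy, expected_fields=4)
```

This is only a heuristic: a stray message that happens to contain the same number of commas would be mistaken for a result, so fixing the output at the source is the more robust solution.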