@lessthanoptimal What do you think of caching the generated random matrix and loading it from disk? In my personal benchmarks I noticed that the random matrix generation takes most of the time, so caching could reduce the runtime by quite a bit.
go for it!
Any idea why GitHub doesn't seem to think the last PR is merged? I'm seeing changes for both. I'll check back in a little bit and see if it fixed itself.
I needed to rebase the PR; it should be fixed now.
Just tried the caching approach, but there seems to be no measurable benefit. I assume the matrices are too small to make a difference. Also, I saw you deprecated the binary loading anyway, and I highly doubt loading via CSV is faster than generating the random matrix.
If it's of any interest at some point, the branch is at https://github.com/FlorentinD/ejml/tree/cacheCSCMatrices.
I could add a new binary file format if that really is a lot faster.
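For reference, a minimal sketch of what such a dedicated binary cache format could look like. The file layout is invented here for illustration; the field names are the public fields of EJML's DMatrixSparseCSC:

```java
// Sketch of a dedicated binary format for caching a DMatrixSparseCSC, as an
// alternative to CSV or Java serialization. The file layout is invented for
// illustration; the field names (numRows, numCols, nz_length, col_idx,
// nz_rows, nz_values) are EJML's DMatrixSparseCSC fields.
import org.ejml.data.DMatrixSparseCSC;
import java.io.*;

public class CscBinaryCache {
    public static void save(DMatrixSparseCSC m, File file) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(file)))) {
            out.writeInt(m.numRows);
            out.writeInt(m.numCols);
            out.writeInt(m.nz_length);
            for (int i = 0; i <= m.numCols; i++) out.writeInt(m.col_idx[i]);
            for (int i = 0; i < m.nz_length; i++) out.writeInt(m.nz_rows[i]);
            for (int i = 0; i < m.nz_length; i++) out.writeDouble(m.nz_values[i]);
        }
    }

    public static DMatrixSparseCSC load(File file) throws IOException {
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream(file)))) {
            int rows = in.readInt(), cols = in.readInt(), nz = in.readInt();
            DMatrixSparseCSC m = new DMatrixSparseCSC(rows, cols, nz);
            m.nz_length = nz;
            for (int i = 0; i <= cols; i++) m.col_idx[i] = in.readInt();
            for (int i = 0; i < nz; i++) m.nz_rows[i] = in.readInt();
            for (int i = 0; i < nz; i++) m.nz_values[i] = in.readDouble();
            return m;
        }
    }
}
```

A benchmark could then check whether the cached file exists before falling back to generating the matrix with RandomMatrices_DSCC.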
It's actually interesting why its use is strongly advised against. Apparently deserializing binary objects can be used to launch attacks, such as executing arbitrary code.
As I said, for these smaller benchmarks it doesn't seem to be worth it. In my own benchmarks the matrices were an order of magnitude larger, so at the moment I don't think it's worth it.
Didn't know that about loading binary objects, but it sounds interesting.
I was thinking of the other benchmarks where you saw an improvement.
If you plan to run some bigger benchmarks on sparse matrices, it is probably worth it.
@lessthanoptimal FYI, I ran into a similar issue where I needed to load large sparse datasets for tests/benchmarks, and ended up implementing DMatrixSparseCSC serialization with MATLAB's .mat file format (Mat5EjmlTest). The binary storage format uses CSC as well, so it can be implemented very efficiently.
Probably not worth it for small datasets though.
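Roughly, the round trip would look like the sketch below. The Mat5/Mat5Ejml package and method names are recalled from that project's examples and may not match the actual API, so treat them as assumptions rather than the real thing:

```java
// Sketch (from memory, not verified against the library) of writing and
// reading a DMatrixSparseCSC through the MAT5 format using the MFL EJML module.
import org.ejml.data.DMatrixSparseCSC;
import us.hebi.matlab.mat.ejml.Mat5Ejml;   // assumed package/class
import us.hebi.matlab.mat.format.Mat5;     // assumed package/class
import us.hebi.matlab.mat.types.MatFile;   // assumed package/class

public class MatFileRoundTrip {
    public static void main(String[] args) throws Exception {
        DMatrixSparseCSC A = new DMatrixSparseCSC(3, 3, 2);
        A.set(0, 0, 1.0);
        A.set(2, 1, 5.0);

        // write the sparse matrix into a .mat file
        MatFile matFile = Mat5.newMatFile().addArray("A", Mat5Ejml.asArray(A)); // assumed method
        Mat5.writeToFile(matFile, "cache.mat");

        // read it back into an EJML sparse matrix
        MatFile loaded = Mat5.readFromFile("cache.mat");
        DMatrixSparseCSC B = Mat5Ejml.convert(loaded.getArray("A"),             // assumed method
                new DMatrixSparseCSC(0, 0, 0));
    }
}
```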
@ennerf Any chance you could donate that .mat serialization to EJML?
Actually, it looks like it's its own project! I'm wondering if we talked about this before. Well, I'll provide a link to it from EJML's website.
@lessthanoptimal we briefly talked about this during the work on sparse solvers a few years ago
I think it's probably better to keep it as a separate project for the time being. I'm not opposed to moving parts into EJML proper, but I'd prefer to first think about a better plugin mechanism to make the integration more seamless. Let me know in case that is something you'd potentially like to pursue.
Thanks for the link!
@FlorentinD if you look on the front page you'll see the link, as well as on the now updated MatrixIO page.
How about adding MatrixIO.loadMatlab() and then using reflection to call your function? If the class doesn't exist, print an error message saying how the dependency can be added. The error would link to a website that explains how to add the dependency.
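A rough sketch of that reflection idea, assuming a placeholder class and method name for the .mat reader (they are illustrative, not the actual API):

```java
// Hypothetical MatrixIO.loadMatlab(): invokes an optional .mat reader via
// reflection so EJML has no compile-time dependency on it. The class name
// "us.hebi.matlab.mat.ejml.Mat5Ejml" and method "load" are placeholders.
import org.ejml.data.DMatrix;

public class MatlabIoSketch {
    public static DMatrix loadMatlab(String fileName) {
        try {
            Class<?> reader = Class.forName("us.hebi.matlab.mat.ejml.Mat5Ejml"); // assumed class
            return (DMatrix) reader.getMethod("load", String.class)
                    .invoke(null, fileName);                                     // assumed method
        } catch (ClassNotFoundException e) {
            throw new RuntimeException(
                    "Loading .mat files requires an optional dependency that is not on the classpath.\n" +
                    "See http://ejml.org for instructions on adding it to your build.", e);
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException("Failed to invoke the .mat reader via reflection", e);
        }
    }
}
```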
Based on #119. Introduces benchmarks for the masked vxm and mxv.