uncomplicate / neanderthal

Fast Clojure Matrix Library
http://neanderthal.uncomplicate.org
Eclipse Public License 1.0
1.07k stars 56 forks source link

Core.matrix basic integration #20

Closed mikera closed 8 years ago

mikera commented 8 years ago

This PR implements basic core.matrix compatibility for Neanderthal, sufficient to pass the core.matrix compliance test suite using the CBLAS implementation.

Optimised operations are not yet implemented for all protocols, but this PR should serve as a base on which to implement the optimised function set.

whilo commented 8 years ago

Nice work! Great to see integration between neanderthal and core.matrix move forward :+1:

blueberry commented 8 years ago

Thank you for taking time to work on this integration, Mike. I hope lots of users will find it useful.

As for the PR, I'll have to put my decision about integrating it into the project on hold. As I said previously in our discussions, I believe that the best place for such integration is a separate library, at least until it is polished enough so that the performance is not hurt, and users actually find it indispensable.

mikera commented 8 years ago

@blueberry What are your criteria / reservations regarding inclusion?

core.matrix is now becoming an official part of Clojure Contrib and I think it would be better to have the integration within Neanderthal so that it works "out of the box" without requiring a separate library. Just trying to make everything as simple as possible for users...

If we do go the route of a separate library, would you be willing to take a PR that includes this library in the Neanderthal test suite for regression testing purposes at least?

blueberry commented 8 years ago

My primary concern is that I'd be stuck with maintaining lots of code that I do not use, and that I am not sure works well regarding the overhead/mismatch on top of Neanderthal.

As you said, this is just a preliminary, experimental version, that is not even complete. I don't have problem with that, since it still can be useful for many people. But, if I include it in Neanderthal, I'd be left to explain users why their code does not work well or fast as advertised, or they could think that Neanderthal is slow in some circumstances when everything is fine with Neanderthal, but the integration is broken/incomplete. Of course, I'd also have to update it whenever I update Neanderthal. Since most users do not read documentation carefully, whenever there is problem with the integration, they'll think Neanderthal sucks. That can be a lot of work, that takes time and energy when there is lots of places where I should be adding additional functionality, and I have limited resources.

Isn't it the same as with Vectorz? Vectors is a stand alone library, and then there is vectorz-clj, which is a core.matrix adapter?

I also think that the right place for the testing suite is in that separate library, but I can offer to run it manually every time I intend to release Neanderthal, and warn you if it fails.

mikera commented 8 years ago

I think I can alleviate your concerns.

If a user raises an issue, just assign it to me. I'm happy to answer questions and make performance enhancements where relevant. This is an open source project after all: you are under no obligation to answer or update everything yourself.

Regarding code changes, there shouldn't be much risk of breakage unless you make breaking changes to the Neanderthal APIs, which I assume would be rare. In general, I have designed this integration to depend only on your public API and the definition of the key Matrix / Vector interfaces. Changes here may cause some breakage but it should be trivial to detect and fix. Again, feel free to assign these to me.

I'm pretty familiar with this kind of work and it is very quick to make updates. For comparison, I spend comparatively little time on vectorz-clj, usually just bumping the Vectorz dependency occasionally. This is because any implementation changes generally happen in the underlying Vectorz library without changing the public API. The only reason vectorz-clj is separate from Vectorz is because Vectorz is designed as a pure Java library - if it was a Clojure library then I would have included the core.matrix integration directly (as I am proposing for Neanderthal).

Overall, I think integrating this code into Neanderthal will be less work for both of us (and certainly less confusing for users) than maintaining, testing and releasing a separate library. I think you have done some great work with Neanderthal (I see it pretty much as an improved, drop-in replacement for Clatrix which is a bit crufty and doesn't offer as much raw performance) so I'm keen to make this integration happen. I also understand you won't necessarily use this part of the code yourself, but I think it is worth doing for the sake of better library interoperability in the ecosystem as a whole.

blueberry commented 8 years ago

But still, core.matrix will not be included in clojure, you still need to put it as a separate dependency in project.clj, and, I guess, the neanderthal's integration story could be the same as vectorz's.

I will be open to reconsider that once the integration becomes stable and low-overhead. Until then, I think it's more convenient if is developed as an independent project.