scicloj / scicloj.ml

A Clojure machine learning library
Eclipse Public License 2.0

Incompatibility with recent versions of tablecloth/tech.ml.dataset #15

Closed: respatialized closed this issue 1 year ago

respatialized commented 1 year ago

Hi there! I am repeatedly encountering strange compatibility issues and compilation errors when trying to use scicloj.ml alongside the most recent version of tech.ml.dataset.

Here's how to reproduce them.

clojure -Sdeps '{:deps {scicloj/scicloj.ml {:mvn/version "0.2.2"} scicloj/tablecloth {:mvn/version "7.000-beta-51"}}}' -e "(require '[scicloj.ml.metamorph])"
> Syntax error (ClassNotFoundException) compiling at (scicloj/ml/smile/nlp.clj:1:1).
smile.nlp.normalizer.SimpleNormalizer

Leaving aside for now the question of why a newer version of t.m.d would cause this particular classpath issue, the fix for this one is relatively straightforward: add the missing library to the classpath. This, however, prompts another issue - a missing function in tech.ml.dataset itself.

clojure -Sdeps '{:deps {scicloj/scicloj.ml {:mvn/version "0.2.2"} scicloj/tablecloth {:mvn/version "7.000-beta-51"} com.github.haifengl/smile-nlp {:mvn/version "2.6.0"}}}' -e "(require '[scicloj.ml.metamorph])"
> Syntax error compiling at (scicloj/ml/metamorph.clj:1928:3).
No such var: tech.v3.dataset.metamorph/select-rows-by-index

It's unclear why this function was removed - I haven't had the time to read through the repo history in detail.

Generally speaking, it seems like this and other issues could be addressed through automated testing of the repo. I would be happy to write and test a GitHub Actions definition that adds an automated test suite to the project to check for compatibility issues like this.

behrica commented 1 year ago

I have not yet updated scicloj.ml (more precisely, scicloj.ml.smile) to the latest TMD (v7.x), because as of today tablecloth is not yet available in v7.x. I will do so once tablecloth 7.x is available, and I propose re-checking the issue then.

behrica commented 1 year ago

This branch contains the needed updates to scicloj.ml.smile for the latest beta of tablecloth 7.x: https://github.com/scicloj/scicloj.ml.smile/tree/tc7.0

I have not checked whether other parts of scicloj.ml still pull in an older version of tech.ml.dataset, but you can try it.
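
One way to check which versions actually get resolved is to print the dependency tree for the same dependency map used in the report above. This is only a sketch: it assumes the Clojure CLI is installed and that the Maven repositories are reachable, and the `DEPS` variable name is purely illustrative.

```shell
# Same dependency map as in the original report (DEPS is an illustrative name).
DEPS='{:deps {scicloj/scicloj.ml {:mvn/version "0.2.2"} scicloj/tablecloth {:mvn/version "7.000-beta-51"}}}'

# -Stree prints the resolved dependency tree; keep only the libraries of interest.
clojure -Sdeps "$DEPS" -Stree | grep -E 'tech\.ml\.dataset|tablecloth'
```

The filtered output should show which tech.ml.dataset and tablecloth versions are requested by the different parts of scicloj.ml and which one ends up selected.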

behrica commented 1 year ago


> It's unclear why this function was removed - I haven't had the time to read through the repo history in detail.

TMD 7.0 removed quite a few dependencies (or expects the user to provide them), so I needed to add them in `scicloj.ml.smile`.

behrica commented 1 year ago

v0.3 of scicloj.ml is out. It now depends on TMD 7.007, so I think this issue can be closed. Depending on scicloj.ml 0.3 now gives you tablecloth and TMD 7.007.
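
For anyone landing here later, the original reproduction can be re-run against the new release as a quick check. This is a sketch, assuming the release is published under the Maven version string "0.3" (taken from the comment above, not verified here) and that the Clojure CLI is installed; `DEPS` is an illustrative variable name.

```shell
# Depend only on scicloj.ml and let it bring in tablecloth and TMD transitively.
# The version string "0.3" is an assumption based on the release named above.
DEPS='{:deps {scicloj/scicloj.ml {:mvn/version "0.3"}}}'

# Same require as in the original report.
clojure -Sdeps "$DEPS" -e "(require '[scicloj.ml.metamorph])"
```

If the release behaves as described, the require should complete without the `ClassNotFoundException` or `No such var` errors shown earlier in this thread.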