src-d / ml

sourced.ml is a library and a set of command-line tools for building and applying machine learning models on top of Universal Abstract Syntax Trees.

Split the sourced-ml package into algorithms and data collection parts #396

Open zurk opened 5 years ago

zurk commented 5 years ago

Dependent projects such as https://github.com/src-d/style-analyzer need only the algorithms part of sourced-ml: https://github.com/src-d/ml/tree/master/sourced/ml/algorithms

The data collection part uses the deprecated jgit-spark-connector, which depends on old packages. This leads to unpleasant dependency conflicts: https://github.com/src-d/style-analyzer/pull/719/files#diff-354f30a63fb0907d4ad57269548329e3R30

That is why we should split the package into two parts.

Guillemdb commented 5 years ago

I am currently trying to make sense of the src-d/ml package, and I am building a map of how the different files depend on each other. I will keep updating it this week.

I hope that, once it's finished, it will help with splitting the package in two.
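
For anyone who wants to reproduce this kind of map automatically, here is a minimal sketch of one possible approach: parse each file with Python's standard `ast` module and record which in-package modules it imports. This is an illustration only, not necessarily how the map in this comment was made. The function names `module_imports` and `dependency_map`, and the module names in the example, are hypothetical.

```python
import ast


def module_imports(source: str) -> set:
    """Return the dotted names of modules imported by the given source text."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import a.b, c" yields "a.b" and "c"
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # "from a.b import x" yields "a.b"; relative imports are skipped
            found.add(node.module)
    return found


def dependency_map(sources: dict, package: str = "sourced.ml") -> dict:
    """Map each module name to the sorted in-package modules it imports."""
    prefix = package + "."
    graph = {}
    for name, source in sources.items():
        graph[name] = sorted(
            m for m in module_imports(source)
            if m == package or m.startswith(prefix)
        )
    return graph


# Hypothetical module names and contents, for illustration only.
example_sources = {
    "sourced.ml.algorithms.demo": "import numpy\nfrom sourced.ml.helpers import f\n",
    "sourced.ml.helpers": "import os\n",
}
example_graph = dependency_map(example_sources)
```

A module under the algorithms part whose entry lists anything from the data collection part would be one of the edges to cut before the split.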

src-d/ml files

The different colors mean the following: