oracle / tribuo

Tribuo - A Java machine learning library
https://tribuo.org
Apache License 2.0

Can I use a DataFrame from Tablesaw (or another dataframe library)? #365

Closed MohamedLEGH closed 2 months ago

MohamedLEGH commented 2 months ago

Hello, I was wondering whether I can use a dataframe from Tablesaw directly in Tribuo? I can't find anything about this in the documentation. With Smile, for example, I can do it directly using: moneyball.selectColumns("RD","W").smile().toDataFrame() (following this tutorial: https://jtablesaw.github.io/tablesaw/userguide/ml/Moneyball%20Linear%20regression)

If that's not possible, do you have an idea of how to do it easily? Or do you use another dataframe library?

Craigacp commented 2 months ago

We don't have an integration with Tablesaw, but you should be able to get the data into Tribuo by iterating over the table's rows and putting each one into a Map<String,String>, which you can then feed to Tribuo's RowProcessor to featurise the inputs. Depending on your data source and the complexity of your Tablesaw processing, you might be able to go directly from the data source into Tribuo via the RowProcessor, which only operates on the columns that you specify.
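To make the first half of that suggestion concrete, here is a minimal sketch of the row-to-map step in plain Java. It deliberately uses a generic header list and list-of-rows as a stand-in for the Tablesaw table, so it runs without Tablesaw or Tribuo on the classpath. The class name RowsToMaps, the method toStringMaps, the empty-string treatment of missing cells, and the sample numbers are all assumptions made for illustration; only the column names RD and W come from the question above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Tribuo or Tablesaw.
public class RowsToMaps {

    // Turns each row into a Map<String,String> keyed by column name,
    // which is the shape the reply above says Tribuo's RowProcessor can consume.
    static List<Map<String, String>> toStringMaps(List<String> header,
                                                  List<? extends List<?>> rows) {
        List<Map<String, String>> out = new ArrayList<>();
        for (List<?> row : rows) {
            if (row.size() != header.size()) {
                throw new IllegalArgumentException(
                    "Row has " + row.size() + " cells, expected " + header.size());
            }
            // LinkedHashMap keeps the columns in header order.
            Map<String, String> record = new LinkedHashMap<>();
            for (int i = 0; i < header.size(); i++) {
                Object cell = row.get(i);
                // Assumption: a missing cell becomes an empty string.
                record.put(header.get(i), cell == null ? "" : String.valueOf(cell));
            }
            out.add(record);
        }
        return out;
    }

    public static void main(String[] args) {
        // Invented sample values, standing in for rows read from a table.
        List<String> header = List.of("RD", "W");
        List<List<Object>> rows = List.of(List.of(39, 91), List.of(-54, 74));

        for (Map<String, String> record : toStringMaps(header, rows)) {
            // In a real pipeline each record would be handed to a configured
            // Tribuo RowProcessor here instead of being printed.
            System.out.println(record);
        }
    }
}
```

With a real Tablesaw table, the header and cell values would come from the table's own accessors rather than from lists, and each resulting map would be passed to a RowProcessor configured with field processors for the feature columns and a response processor for the target column. The exact calls on both sides are best checked against the Tablesaw and Tribuo Javadoc.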

MohamedLEGH commented 2 months ago

OK, thanks for your answer. Do you have any examples or tutorials? I found this tutorial, https://tribuo.org/learn/4.3/tutorials/columnar-tribuo-v4.html, but I don't know whether you have additional examples.

Craigacp commented 2 months ago

That's the one we've got for processing columnar data. I don't think we have any additional tutorials on the topic.

MohamedLEGH commented 2 months ago

OK, thanks, I will look into it.