mljar / mljar-supervised

Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation
https://mljar.com
MIT License

Please document all preprocessing methods #713

Open gdevenyi opened 5 months ago

gdevenyi commented 5 months ago

Digging into the code, it looks like mljar tries various preprocessing techniques on the data, but none of the documentation covers what it attempts.

pplonski commented 5 months ago

Hi @gdevenyi,

You are right. We need to improve the docs. Have you been able to find the required information about preprocessing in the code? What is your use case?

@Bocianski this might be a good issue to start contributing to mljar-supervised :+1:

Bocianski commented 5 months ago

Yes sir 🫡 @pplonski

gdevenyi commented 5 months ago

> Have you been able to find the required information about preprocessing in the code? What is your use case?

I think I did. I was looking at how you handle categorical data in X, particularly whether you try out different encoding schemes.

gdevenyi commented 5 months ago

(I think you do this right now; see https://github.com/mljar/mljar-supervised/issues/367)
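
For anyone landing on this thread without context, here is a minimal sketch of what "different encoding schemes" for a categorical column can mean. It is plain Python for illustration only and is not taken from the mljar-supervised code base; the function names `integer_encode` and `one_hot_encode` are hypothetical, and which schemes the package actually tries is exactly what this issue asks to have documented.

```python
# Illustrative only: two common ways to encode a categorical column.
# These helpers are hypothetical and are NOT mljar-supervised APIs.

def integer_encode(values):
    """Map each distinct category to an integer, in sorted order."""
    mapping = {cat: idx for idx, cat in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


def one_hot_encode(values):
    """Expand a column into one 0/1 indicator per distinct category."""
    categories = sorted(set(values))
    rows = [[1 if v == cat else 0 for cat in categories] for v in values]
    return rows, categories


colors = ["red", "green", "blue", "green"]

codes, mapping = integer_encode(colors)
rows, categories = one_hot_encode(colors)

print(mapping)     # category -> integer code
print(codes)       # one integer per input value
print(categories)  # column order of the one-hot matrix
print(rows)        # one indicator row per input value
```

Integer encoding keeps a single column but imposes an arbitrary order on the categories, while one-hot encoding avoids that order at the cost of one column per category, which is why a tool might reasonably try more than one scheme.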