IQVIA-ML / LightGBM.jl

Julia FFI interface to Microsoft's LightGBM package
Other
93 stars 10 forks source link

Adding support for constructing Dataset from SparseMatrix #113

Closed yaxxie closed 2 years ago

yaxxie commented 3 years ago

As title:

yaxxie commented 3 years ago

@kd-iqvia do you consider this sufficient for an initial take for sparse matrix support?

The tests aren't exhaustive (e.g. I don't demonstrate that you can in fact mix dense/sparse matrices for the train test, and I haven't covered a huge range of scenarios for checking the sparse inputs are correctly received, there is currently no support for predictions given sparse matrices), but it looks like it works. Is there anything you want to see or would need to see before merge?

yaxxie commented 2 years ago

@FatemehTahavori Is there anything you think is missing prior to merge?

kainkad commented 2 years ago

@yaxxie thank you so much for this PR. Adding support for SparseMatrix predictions and the MLJ integration can be done at a later stage. I know the tests at the moment are not exhaustive to more use cases but you showed a higher level integrity of the LGBM_DatasetCreateFromCSC function you added (yes it doesn't crash :D) and also another test for fitting model on dense and sparse train inputs and compared predictions are within a margin of tolerance so I think it's enough for the initial take on supporting sparse matrices. We might be having some more use cases with the data we work with so we might just enhance the test cases as some point.