-
I have worked with imbalanced datasets and created a confusion matrix. The numeric CM shows the distribution of the test data properly. However, the percentage CM does not match the numeric CM, …
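The report is cut off, so the actual cause is not known here. One common source of such a mismatch is the normalization denominator: dividing each cell by its true-class row total gives very different percentages from dividing by the grand total, especially when classes are imbalanced. The sketch below uses made-up counts and hypothetical helper names (`normalize_rows`, `normalize_all`) to show the difference.

```python
def normalize_rows(cm):
    """Percent of each true class (row) that landed in each predicted class."""
    out = []
    for row in cm:
        total = sum(row)
        out.append([100.0 * v / total if total else 0.0 for v in row])
    return out


def normalize_all(cm):
    """Percent of all samples that fall in each cell."""
    total = sum(sum(row) for row in cm)
    return [[100.0 * v / total if total else 0.0 for v in row] for row in cm]


# Hypothetical counts: rows = true class, columns = predicted class.
# 990 negatives and 10 positives, i.e. a 1% positive rate.
counts = [
    [950, 40],
    [4, 6],
]

by_row = normalize_rows(counts)    # each row sums to 100
by_total = normalize_all(counts)   # the whole matrix sums to 100

for name, matrix in (("per true class", by_row), ("of all samples", by_total)):
    print(name)
    for row in matrix:
        print("  " + "  ".join(f"{v:6.2f}%" for v in row))
```

With row normalization the minority-class row reads as recall-style percentages, while with grand-total normalization the same cells shrink to fractions of a percent. Both are consistent with the counts; they answer different questions.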
-
continuing from #79
In splitting.py, the left/right_indices_buffer will use up 8GB for 10^9 rows. If that causes swapping, the performance benefit of multithreading (which requires these buffers) a…
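The 8GB figure can be checked with a quick back-of-the-envelope calculation. This is only a sketch: it assumes two buffers (left and right) of 4-byte indices such as `uint32`, which is an assumption and not something stated in the excerpt. The function name `index_buffer_bytes` is hypothetical.

```python
def index_buffer_bytes(n_rows, bytes_per_index=4, n_buffers=2):
    """Memory needed for `n_buffers` index arrays of `n_rows` entries each.

    bytes_per_index=4 assumes uint32 indices (an assumption, not confirmed
    by the excerpt); n_buffers=2 stands for the left and right buffers.
    """
    return n_rows * bytes_per_index * n_buffers


total = index_buffer_bytes(10**9)
total_gb = total / 1e9  # decimal gigabytes

print(f"{total} bytes = {total_gb:.1f} GB")
```

Under these assumptions the two buffers alone account for 8 GB at 10^9 rows, on top of the data itself.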
-
**Describe the bug**
I finished the CI/CD pipeline and ran the program on Rainbond, but the program gets stuck. I tried to solve this problem and found the [#issue](https://lightgbm.readthedocs.io/en…
-
I have an imbalanced dataset (positive class rate = 1%) and have downsampled the negative class to give me a 50/50 balance in the two classes. Ignoring the challenges that come with undersampling (l…
-
Hi,
I am trying to build a standard pipeline for tabular data that works nicely with ONNX. Ideally, the pipeline would:
1. Be based on boosted trees
2. Gracefully support mixed types (categoric…
-
In LightGBM training, you can use a pre-existing model. This does not exist yet in our implementation. We could introduce an optional input for continuous training.
-
I'm curious what folks would think about adding use case-specific pages to the Dask docs. Specifically, I was thinking about pages for machine learning and workflow orchestration where there is an esp…
-
Hi, I have a binary classification dataset where labels are sorted (I know, it's against standard ML practice to have data sorted, but the question is in the spirit of understanding Distributed LightG…
-
## Description
When we create a dataset with non default parameters, save it and load it - `construct()` breaks.
## Reproducible example
```r
library(lightgbm)
# …
```
-
@imatiach-msft
I have encountered some trouble when training a LambdaRank model in Spark with LightGBMRanker in mmlspark.
With the same training data, the training results in Spark and on a local machine a…