maciejkula / spotlight

Deep recommender models using PyTorch.
MIT License

Classification: generating synthetic unbalanced data #180

Open Sandy4321 opened 4 years ago

Sandy4321 commented 4 years ago

Could you share some links on generating synthetic unbalanced data for classification? Your code generates synthetic data for recommendation systems: https://maciejkula.github.io/spotlight/datasets/synthetic.html

I mean data that is close to real data, with a mix of categorical and continuous feature values, beyond the well-known simple generator https://scikit-learn.org/stable/modules/generated/sklearn.datasets.make_classification.html, whose class balance is set by the weights parameter:

weights : array-like of shape (n_classes,) or (n_classes - 1,), (default=None) The proportions of samples assigned to each class. If None, then classes are balanced. Note that if len(weights) == n_classes - 1, then the last class weight is automatically inferred. More than n_samples samples may be returned if the sum of weights exceeds 1.

Or could your code be used for binary classification with a mix of categorical and continuous feature values, where different groups of features have complicated dependencies?
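
For reference, here is a minimal sketch of the scikit-learn route mentioned above. It is not something Spotlight provides. It uses the weights parameter of make_classification to get an unbalanced binary target, then turns two of the continuous columns into integer-coded categorical features by quantile binning. The helper to_categorical, the 95/5 class split, and the bin counts are assumptions made for illustration. Binning a continuous column is only one simple way to obtain categorical features, and it does not model complicated dependencies between feature groups.

```python
import numpy as np
from sklearn.datasets import make_classification


def to_categorical(column, n_bins):
    """Hypothetical helper: map a continuous column to integer codes 0..n_bins-1
    using equal-frequency (quantile) bins."""
    # Interior quantile edges, e.g. the 1/3 and 2/3 quantiles for n_bins=3.
    edges = np.quantile(column, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.digitize(column, edges)


# Unbalanced binary problem: about 95% of samples in class 0.
# With a single weight given, the weight of the last class is inferred.
X, y = make_classification(
    n_samples=10_000,
    n_features=6,
    n_informative=4,
    n_redundant=0,
    n_classes=2,
    weights=[0.95],
    flip_y=0.0,       # no label noise, so the class ratio follows the weights
    random_state=0,
)

# Keep four columns continuous and bin the last two into categories.
X_num = X[:, :4]
cat_a = to_categorical(X[:, 4], n_bins=3)
cat_b = to_categorical(X[:, 5], n_bins=5)

print("positive rate:", y.mean())
print("categories in cat_a:", np.unique(cat_a))
print("categories in cat_b:", np.unique(cat_b))
```

Because the categorical columns are derived from columns that make_classification generated, they carry whatever signal those columns had. Richer dependencies between feature groups would have to be constructed by hand.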