SmartDataAnalytics / SML-Bench

A Benchmark for Machine Learning from Structured Data
Apache License 2.0

Automatically generating learning problems #22

Open Demirrr opened 3 years ago

Demirrr commented 3 years ago

Hello,

I was wondering whether SML-Bench could be used to generate a list of positive and negative examples from a given knowledge base. I would like to generate such examples and store them in a JSON file.

Cheers

SimonBin commented 3 years ago

How should it generate these positive and negative examples?

Demirrr commented 3 years ago

I reckon that there might be many options. I would humbly suggest the following ones:

  1. Randomly generate X concepts, each with a length between 3 and 5. For each concept, randomly sample Y individuals as positive examples E^+ and Y individuals from \top - E^+ as negative examples E^-.

  2. Randomly generate X concepts, each of which returns True for a given set of individuals. Then generate E^+ and E^- as described in (1).

I could go on enumerating options. However, I am not 100% sure whether such options would be relevant to everyone.
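
To make option (1) more concrete, here is a minimal Python sketch of the sampling step only. It is not part of SML-Bench: the function names, the JSON keys, and the toy data are all hypothetical. It assumes that the concepts have already been generated and that a reasoner has returned the instances of each concept, and that E^+ is drawn from those instances, which the description above leaves open.

```python
import json
import random


def sample_examples(instances, all_individuals, y, rng):
    """Sample up to y positives from a concept's instances and up to y negatives.

    Assumption: E^+ is drawn from the instances of the concept.
    E^- is drawn from all individuals minus E^+, as in option (1), so an
    unsampled instance of the concept can still end up in E^-.
    """
    # Sort before sampling so that results are reproducible for a fixed seed.
    positives = rng.sample(sorted(instances), min(y, len(instances)))
    candidates = sorted(set(all_individuals) - set(positives))
    negatives = rng.sample(candidates, min(y, len(candidates)))
    return {"positive_examples": positives, "negative_examples": negatives}


def build_learning_problems(concept_instances, all_individuals, y, seed=0):
    """Build one learning problem per concept name."""
    rng = random.Random(seed)
    return {
        name: sample_examples(instances, all_individuals, y, rng)
        for name, instances in concept_instances.items()
    }


def save_learning_problems(problems, path):
    """Store the learning problems in a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(problems, handle, indent=2)


# Toy data standing in for a knowledge base and for reasoner output.
toy_individuals = ["anna", "bob", "carl", "dora", "eve", "finn"]
toy_instances = {
    "Parent": {"anna", "bob", "carl"},
    "Female": {"anna", "dora", "eve"},
}

problems = build_learning_problems(toy_instances, toy_individuals, y=2)
payload = json.dumps(problems, indent=2)
```

Calling `save_learning_problems(problems, "learning_problems.json")` would write the same structure to disk. If negatives should never be instances of the concept, the candidate pool could instead be computed as all individuals minus the instances.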