Closed fnpdaml closed 1 year ago
hi @fnpdaml :
to use your own datasets, you want to check out / modify read_file in read_file.py: https://github.com/cavalab/srbench/blob/4cc90adc9c450dad3cb3f82c93136bc2cb3b1a0a/experiment/read_file.py
if your datasets follow the convention of https://github.com/EpistasisLab/pmlb/tree/master/datasets, i.e. they are in a pandas dataframe with the target column labelled "target", you can call read_file directly, just passing the filename like you would with any of the PMLB datasets.
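To illustrate the convention, here is a minimal sketch that writes a made-up dataset in the PMLB layout and then loads it back the way read_file would (the dataset name, values, and the load-and-pop logic are illustrative assumptions, not copied from the repo):

```python
import tempfile
from pathlib import Path

import pandas as pd

# Illustrative dataset (names and values are made up): y = 3*x1 + 2*x2
df = pd.DataFrame({"x1": [0.0, 1.0, 2.0, 3.0],
                   "x2": [1.0, 0.5, 0.0, -0.5]})
df["target"] = 3 * df["x1"] + 2 * df["x2"]

# PMLB files are tab-separated and gzip-compressed (*.tsv.gz)
path = Path(tempfile.mkdtemp()) / "eq1.tsv.gz"
df.to_csv(path, sep="\t", index=False, compression="gzip")

# A minimal stand-in for what read_file does with such a file:
# load it, pop the "target" column, return features and labels
frame = pd.read_csv(path, sep="\t", compression="gzip")
y = frame.pop("target").to_numpy()
X = frame.to_numpy()
print(X.shape, y.shape)  # (4, 2) (4,)
```

If your file follows this layout, the filename should be all you need to pass.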
read_file is called in evaluate_model here: https://github.com/cavalab/srbench/blob/4cc90adc9c450dad3cb3f82c93136bc2cb3b1a0a/experiment/evaluate_model.py#L39
hope that helps
Hi and thanks for that again! (I also use the above account interchangeably)
I had to do 3 things:
Generated my data in a pandas dataframe with the target column labelled "target" [I'm fitting a ground truth: "eq1.tsv"];
Then it seemed easier just to mimic the PMLB layout - created the corresponding "metadata.yaml" and "summary_stats.tsv" files;
Compressed my data to "eq1.tsv.gz" - only then did it work.
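For anyone repeating these steps, they can be scripted as below. The metadata and summary fields are illustrative placeholders loosely modeled on PMLB; the exact fields expected should be copied from a real PMLB dataset rather than from this sketch:

```python
from pathlib import Path

import pandas as pd

# Build a PMLB-style dataset directory: datasets/eq1/
root = Path("datasets/eq1")
root.mkdir(parents=True, exist_ok=True)

# 1. The data itself: tab-separated, gzipped, target column named "target"
#    (the equation and values here are made up for illustration)
df = pd.DataFrame({"x1": [0.0, 1.0, 2.0], "x2": [1.0, 2.0, 3.0]})
df["target"] = df["x1"] ** 2 + df["x2"]
df.to_csv(root / "eq1.tsv.gz", sep="\t", index=False, compression="gzip")

# 2. Minimal companion files mimicking the PMLB convention;
#    field names below are placeholders, not a verified schema
(root / "metadata.yaml").write_text(
    "dataset: eq1\n"
    "task: regression\n"
)
(root / "summary_stats.tsv").write_text(
    "n_instances\tn_features\ttask\n"
    "3\t2\tregression\n"
)

print(sorted(p.name for p in root.iterdir()))
# ['eq1.tsv.gz', 'metadata.yaml', 'summary_stats.tsv']
```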
But a few issues with "analyze.py":
"--time_limit" seems to have no effect on how long it is allowed to run (I set 5 min but had to abort after a day).
My data came from a 2nd-order equation, which was not discovered. However, several methods seemed to converge on the same answer, different from the original - is there normalization going on, and where is this controlled and recorded?
Many thanks!
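On the normalization point: if a method standardizes inputs before fitting, the symbolic form it reports can look different from the ground truth while being exactly equivalent once the scaling is undone. A minimal sketch of that effect using plain least squares (not any SRBench method):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=200)
y = 4.0 * x**2 + 1.0                  # ground truth: y = 4*x^2 + 1

z = x**2                              # the "discovered" feature

# Fit y = a*z + b on the raw feature
A = np.column_stack([z, np.ones_like(z)])
a_raw, b_raw = np.linalg.lstsq(A, y, rcond=None)[0]

# Fit the same model on the standardized feature
mu, sigma = z.mean(), z.std()
zs = (z - mu) / sigma
As = np.column_stack([zs, np.ones_like(zs)])
a_std, b_std = np.linalg.lstsq(As, y, rcond=None)[0]

# The standardized coefficients look different from the ground truth,
# but map back exactly: a = a_std / sigma, b = b_std - a_std * mu / sigma
print(a_raw, b_raw)                               # ~4.0, ~1.0
print(a_std / sigma, b_std - a_std * mu / sigma)  # ~4.0, ~1.0 again
```

So two models with different printed coefficients can encode the same equation; whether a given SRBench method scales its inputs, and whether that is recorded, would need to be checked in its wrapper code.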
(bump)
evaluate_model: https://github.com/cavalab/srbench/blob/47da695292938d5e696ddcd4252f4034330ef787/experiment/evaluate_model.py#L24
Hi there, Some quick start help needed:
After installing, how do I run the benchmarks on user-supplied data? I'm struggling to get this to work, and want to make sure there's nothing wrong with my SRBench install.
Many thanks!