WGLab / Project_Belka

Create an internal train/test split to assess different prediction approaches #1

Closed kaichop closed 3 weeks ago

kaichop commented 3 weeks ago

Take a random 80% of the rows in train.csv to create a new file, newtrain, and use the remaining 20% to create newtest, so that we can evaluate different approaches against known results. The split should be random rather than simply taking the first 80% of the lines.
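For reference, here is a minimal sketch of one way such a random 80/20 split could be done. It is not necessarily how the files in this thread were produced. The function name `split_csv`, the fixed seed, and the output file names are assumptions for illustration, and the sketch assumes train.csv has a single header row. It reads the file twice so that the rows themselves never have to be held in memory.

```python
import csv
import random


def split_csv(src, train_out, test_out, test_frac=0.2, seed=42):
    """Randomly split a CSV with a header row into train and test files.

    Hypothetical helper: returns (n_train_rows, n_test_rows).
    """
    # First pass: count data rows (the header is excluded).
    with open(src, newline="") as f:
        n_rows = max(sum(1 for _ in csv.reader(f)) - 1, 0)

    # Choose the test rows uniformly at random, without replacement.
    # A fixed seed makes the split reproducible.
    rng = random.Random(seed)
    test_idx = set(rng.sample(range(n_rows), round(n_rows * test_frac)))

    # Second pass: copy the header to both files, then route each row.
    with open(src, newline="") as f, \
         open(train_out, "w", newline="") as f_train, \
         open(test_out, "w", newline="") as f_test:
        reader = csv.reader(f)
        train_w, test_w = csv.writer(f_train), csv.writer(f_test)
        header = next(reader)
        train_w.writerow(header)
        test_w.writerow(header)
        for i, row in enumerate(reader):
            (test_w if i in test_idx else train_w).writerow(row)

    return n_rows - len(test_idx), len(test_idx)
```

A call such as `split_csv("train.csv", "newtrain.csv", "newtest.csv")` would then write the two files. Note that the set of test indices grows with the size of the input, so for a very large train.csv a per-row random draw (an approximate rather than exact 80/20 split) may be the more practical choice.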

umahsn commented 3 weeks ago

Completed.

umahsn commented 3 weeks ago

Located here: /mnt/isilon/wang_lab/shared/Belka/raw_data

Euchiz commented 3 weeks ago

I generated Parquet versions of the splits and saved them in the same location.