kaichop closed 3 weeks ago
Take 80% of train.csv to create a new newtrain file, and the remaining 20% to create a newtest file, so that we can evaluate different approaches against known results. The split should be random rather than simply taking the first 80% of the lines.
Completed.
Located here: /mnt/isilon/wang_lab/shared/Belka/raw_data
I also wrote Parquet versions of the splits to the same location.
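For reference, a random 80/20 split like the one requested could be done along these lines. This is only a minimal sketch, not necessarily how the files above were produced: the function name `split_csv`, the seed, and the output file names are assumptions for illustration. It streams the file row by row and assigns each row to the train file with probability 0.8, so the split is approximately (not exactly) 80/20, but it never loads the whole CSV into memory.

```python
import csv
import random


def split_csv(src, train_out, test_out, train_frac=0.8, seed=42):
    """Randomly split the data rows of a CSV file into two CSV files.

    Hypothetical helper: each row goes to train_out with probability
    train_frac, otherwise to test_out. The header row is copied to both.
    Returns (n_train, n_test).
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    n_train = n_test = 0
    with open(src, newline="") as fin, \
            open(train_out, "w", newline="") as ftrain, \
            open(test_out, "w", newline="") as ftest:
        reader = csv.reader(fin)
        train_writer = csv.writer(ftrain)
        test_writer = csv.writer(ftest)

        # Copy the header into both output files.
        header = next(reader)
        train_writer.writerow(header)
        test_writer.writerow(header)

        # Assign every data row independently at random.
        for row in reader:
            if rng.random() < train_frac:
                train_writer.writerow(row)
                n_train += 1
            else:
                test_writer.writerow(row)
                n_test += 1
    return n_train, n_test
```

Usage would look like `split_csv("train.csv", "newtrain.csv", "newtest.csv")`, run from the directory that holds train.csv. If an exact 80/20 row count matters, the rows would have to be counted first and the test indices sampled explicitly, at the cost of a second pass over the file.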