rbroc / echo

A Scalable and Explainable Approach to Discriminating Between Human and Artificially Generated Text
https://cc.au.dk/en/clai/current-projects/a-scalable-and-explainable-approach-to-discriminating-between-human-and-artificially-generated-text

Create script to split data for classifiers and save to folder #70

Closed MinaAlmasi closed 1 month ago

MinaAlmasi commented 2 months ago

Currently, I split the data with the same function and the same seed (a default seed is set) in multiple scripts, whenever I run classification and similar steps. To avoid potential mishaps, let's split once and save the data to a folder. This matters especially because we also fit PCA models on the training data ONLY, so repeated splitting leaves a lot of steps where things can go wrong.
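A rough sketch of what such a script could look like, using only the standard library. Everything named here is an assumption for illustration and not taken from the repository: the function names `split_and_save` and `load_split`, the seed value, the 80/10/10 fractions, and the `train.csv` / `val.csv` / `test.csv` file names.

```python
import csv
import random
from pathlib import Path


def split_and_save(rows, out_dir, seed=129, val_frac=0.1, test_frac=0.1):
    """Shuffle rows once with a fixed seed and write train/val/test CSVs.

    `rows` is a list of dicts sharing the same keys. The seed and the
    fractions are placeholder defaults, not the project's settings.
    """
    if not rows:
        raise ValueError("rows must not be empty")

    # Copy before shuffling so the caller's list is left untouched.
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)

    n_val = int(len(shuffled) * val_frac)
    n_test = int(len(shuffled) * test_frac)
    splits = {
        "val": shuffled[:n_val],
        "test": shuffled[n_val:n_val + n_test],
        "train": shuffled[n_val + n_test:],
    }

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    fieldnames = list(rows[0].keys())
    for name, subset in splits.items():
        with open(out_dir / f"{name}.csv", "w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(subset)

    # Return the split sizes so the caller can log or sanity-check them.
    return {name: len(subset) for name, subset in splits.items()}


def load_split(out_dir, name):
    """Read one saved split back as a list of dicts (all values are strings)."""
    with open(Path(out_dir) / f"{name}.csv", newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

Downstream scripts (classifiers, PCA fitting) would then call something like `load_split("splits", "train")` instead of re-splitting, so every step sees exactly the same rows.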

MinaAlmasi commented 1 month ago

Closing with #73