MetOffice / XBTs_classification

Project for the classification of eXpendable Bathy Thermographs
BSD 3-Clause "New" or "Revised" License
4 stars 2 forks source link

split train/test by instrument #32

Closed stevehadd closed 4 years ago

stevehadd commented 4 years ago

Currently, when splitting data intro train/test, we sample from each year, so profiles are split in the same proportion for each year. We do not take instrument into account when doing the split. We should aim to also sample from each instrument. SWo the function should iterate through each year/instrument value pair, and sample n profiles based on the train/test fractions.

stevehadd commented 4 years ago

This has now been implemented.

stevehadd commented 4 years ago

This has now been implemented in the dataset class and demomnstrated in the decision tree notebook, so closing the issue.