Currently, when splitting data into train/test sets, we sample from each year, so profiles are split in the same proportion for every year. The split does not take instrument into account. We should also sample from each instrument: the function should iterate over each year/instrument pair and sample n profiles from it according to the train/test fractions.
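
A minimal sketch of what the year/instrument split could look like. It is an illustration, not the existing function: the name `split_by_year_and_instrument`, the `train_frac` and `seed` parameters, and the representation of profiles as dicts with `"year"` and `"instrument"` keys are all assumptions made for this example.

```python
import random
from collections import defaultdict


def split_by_year_and_instrument(profiles, train_frac=0.8, seed=0):
    """Split profiles into train/test, sampling within each (year, instrument) pair.

    Assumption: `profiles` is a list of dicts that each carry hypothetical
    "year" and "instrument" keys; the real data structure may differ.
    """
    rng = random.Random(seed)

    # Group profiles by their (year, instrument) pair.
    groups = defaultdict(list)
    for profile in profiles:
        groups[(profile["year"], profile["instrument"])].append(profile)

    train, test = [], []
    # Iterate in sorted key order so the result is reproducible for a given seed.
    for key in sorted(groups):
        members = list(groups[key])
        rng.shuffle(members)
        # Number of training profiles for this pair, rounded to an integer.
        n_train = round(len(members) * train_frac)
        train.extend(members[:n_train])
        test.extend(members[n_train:])
    return train, test


# Made-up example data: 2 years x 2 instruments x 10 profiles each.
profiles = [
    {"id": i, "year": year, "instrument": instrument}
    for i, (year, instrument) in enumerate(
        (y, s) for y in (2019, 2020) for s in ("A", "B") for _ in range(10)
    )
]
train, test = split_by_year_and_instrument(profiles, train_frac=0.8, seed=42)
```

Because the count is rounded per pair, a year/instrument pair with very few profiles can end up entirely in train or entirely in test; whether that is acceptable, or whether such pairs need special handling, is a design choice left open here.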