Closed by sabjoslo 6 years ago
There's another issue with the random subsample. To calculate topic contribution properly, we need the number of instances that come from each year, and that count isn't clear for the first few years, where we're sampling randomly. If you could add an output to your sampling function that includes those counts, I could finish up the analysis and write to all collaborators with the results. Thanks!
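For illustration only, here is a minimal sketch of what such an output might look like. It is not the project's actual sampling function: the name `sample_with_year_counts`, its parameters, and the assumption that each document's year is available as a plain list are all hypothetical.

```python
import random
from collections import Counter


def sample_with_year_counts(years, sample_size, seed=None):
    """Draw a random subsample and report how many sampled items fall in each year.

    Hypothetical helper: `years[i]` is assumed to hold the year of document i.
    Returns the sorted sampled indices and a {year: count} dict for the sample.
    """
    rng = random.Random(seed)  # local RNG so a seed gives a reproducible sample
    sampled_indices = sorted(rng.sample(range(len(years)), sample_size))
    # Tally the year of every sampled document.
    year_counts = Counter(years[i] for i in sampled_indices)
    return sampled_indices, dict(sorted(year_counts.items()))


# Toy data (made up): 5 documents from 2006, 10 from 2007, 20 from 2008.
toy_years = [2006] * 5 + [2007] * 10 + [2008] * 20
indices, counts = sample_with_year_counts(toy_years, sample_size=12, seed=0)
```

The per-year counts always sum to the sample size, so they can serve directly as the denominators when normalizing topic contribution by year.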
@BabakHemmatian, see ecbc10f.
See title.