ERijck closed this issue 1 year ago
Hi,
Each dataset folder contains a "metadata.json" file with the preprocessing details (see the field preprocessing-info). See the BBC news dataset for an example.
As far as I remember, we selected those values by iteratively trying different values and inspecting the resulting topics.
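In case it helps, here is a minimal sketch for reading that field from a dataset folder. It assumes only what is stated above: that the folder holds a "metadata.json" file with a preprocessing-info field. The function name `load_preprocessing_info` and the folder path in the usage note are hypothetical, and the keys stored inside the field may differ per dataset.

```python
import json
from pathlib import Path


def load_preprocessing_info(dataset_dir):
    """Return the "preprocessing-info" entry of metadata.json, or None if absent."""
    # Assumption: metadata.json sits directly inside the dataset folder.
    metadata_path = Path(dataset_dir) / "metadata.json"
    with metadata_path.open(encoding="utf-8") as f:
        metadata = json.load(f)
    # .get() avoids a KeyError if a dataset lacks the field.
    return metadata.get("preprocessing-info")
```

Calling `load_preprocessing_info("path/to/dataset_folder")` with the path of a downloaded dataset folder should return whatever is stored under that field, which you can then print or compare across datasets.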
Thanks, Silviatti!
Hi, is it possible to share the preprocessing settings for each dataset? For example, what was the threshold for removing frequent/infrequent words?