We now have several instances of corpora that are in fact versions of the same data: the HamleDT data (which has been replaced by Universal Dependencies) and the PDT data (in versions PDT 2.5, PDT 3.0, and PDiT 2.0; the last one is the newest).
In other cases, the older versions have simply been removed from the web interface, which might cause problems with the reproducibility of older research (e.g., we have just submitted a paper with links to some PML-TQ queries; if the corpus is later removed, these "permanent" links will break, right?).
It would be good to be able to keep older versions "running but hidden": accessible only through direct links (e.g., ones that people have stored as bookmarks), but not listed in "Browse treebanks". A user landing on an older version of a corpus should also be warned that a newer version exists.
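To make the request concrete, here is a minimal sketch of the behaviour I have in mind. It is not based on the actual PML-TQ code: the `Treebank` record, the `listed` and `superseded_by` fields, the treebank ids, and all function names are hypothetical, and the 2.5 → 3.0 → PDiT 2.0 chain is just my assumption about how the PDT versions would be linked.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Treebank:
    """Hypothetical treebank record; not the real PML-TQ data model."""
    treebank_id: str                      # id used in stored query links
    title: str
    listed: bool = True                   # False means "running but hidden"
    superseded_by: Optional[str] = None   # id of the next newer version, if any


# Illustrative registry; ids and the version chain are assumptions.
REGISTRY: Dict[str, Treebank] = {
    tb.treebank_id: tb
    for tb in [
        Treebank("pdt25", "PDT 2.5", listed=False, superseded_by="pdt30"),
        Treebank("pdt30", "PDT 3.0", listed=False, superseded_by="pdit20"),
        Treebank("pdit20", "PDiT 2.0"),
    ]
}


def browse_treebanks(registry: Dict[str, Treebank]) -> List[Treebank]:
    """Return only the treebanks that should appear in "Browse treebanks"."""
    return [tb for tb in registry.values() if tb.listed]


def newest_version(registry: Dict[str, Treebank], treebank_id: str) -> Treebank:
    """Follow superseded_by links to the newest version (guards against cycles)."""
    seen = set()
    current = registry[treebank_id]
    while current.superseded_by is not None and current.treebank_id not in seen:
        seen.add(current.treebank_id)
        current = registry[current.superseded_by]
    return current


def landing_notice(registry: Dict[str, Treebank], treebank_id: str) -> Optional[str]:
    """Return a warning for users who land on an older version, else None."""
    current = registry[treebank_id]
    newest = newest_version(registry, treebank_id)
    if newest.treebank_id == current.treebank_id:
        return None
    return (
        f"You are viewing {current.title}. "
        f"A newer version of this corpus is available: {newest.title}."
    )
```

With something like this, old links to `pdt25` would still resolve, only PDiT 2.0 would show up in the browse list, and `landing_notice(REGISTRY, "pdt25")` would give the text for the warning banner.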
Related:

- For the Lindat repository: https://github.com/ufal/clarin-dspace/issues/412#issuecomment-346624992
- For KonText: https://github.com/ufal/lindat-kontext/issues/59