proycon / folia

FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (including corpora) with linguistic annotations. A wide variety of linguistic annotations are supported, making FoLiA a useful format for NLP tasks and data interchange. Note that the actual Python library for processing FoLiA is implemented as part of PyNLPl; this repository contains higher-level tools that use the library, as well as the full documentation, validation schemas, and set definitions.
http://proycon.github.io/folia/
GNU General Public License v3.0

FoLiA v2.0 release plan #68

Closed proycon closed 4 years ago

proycon commented 5 years ago

The upcoming FoLiA v2.0 release will introduce various major changes, both to the format itself and to the surrounding software infrastructure. By definition, old FoLiA 0.x and 1.x tools are not forward compatible with FoLiA 2.0. FoLiA 2.0 tools, however, are backward compatible as always, but only in a limited sense: they can read older FoLiA but will typically produce newer FoLiA if any edits are made (see also #67).
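To make the compatibility rule above concrete, here is a minimal sketch in Python. The `Tool` class and its methods are hypothetical and not part of any FoLiA library; the sketch also assumes, as a simplification, that a tool can read every version up to and including the one it implements.

```python
from dataclasses import dataclass


def parse_version(version: str) -> tuple:
    """Turn a dotted version string such as '1.5' into a comparable tuple (1, 5)."""
    return tuple(int(part) for part in version.split("."))


@dataclass
class Tool:
    # Hypothetical stand-in for any FoLiA-processing tool; not a real API.
    name: str
    supports: str  # highest FoLiA version the tool implements, e.g. "2.0"

    def can_read(self, doc_version: str) -> bool:
        # Backward compatible: older documents are accepted.
        # Not forward compatible: a 1.x tool rejects a 2.0 document.
        return parse_version(doc_version) <= parse_version(self.supports)

    def version_after_edit(self, doc_version: str) -> str:
        # Assumption: an edited document is written out in the tool's own version,
        # so an older document is implicitly upgraded.
        if not self.can_read(doc_version):
            raise ValueError(
                f"{self.name} supports FoLiA up to v{self.supports}, "
                f"cannot read v{doc_version}"
            )
        return self.supports


legacy_tool = Tool("legacy-tool", "1.5")
new_tool = Tool("new-tool", "2.0")
```

With these two objects, `new_tool.can_read("1.5")` is true while `legacy_tool.can_read("2.0")` is false, and `new_tool.version_after_edit("1.5")` yields `"2.0"`. That implicit upgrade is why software that must keep producing v1.5 needs the legacy tools during the transition.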

To be minimally disruptive I propose the following:

1) We release FoLiA 2.0 as stored in this repository; that includes the format, specification, and completely renewed documentation. The older v1.5 documentation remains available for reference anyway.
2) The new FoLiA-tools are released; the old one remains available (perhaps with an extra rerelease of the v1.5 versions as foliatools-legacy so we can have the two side by side, as needed e.g. in LaMachine).
3) We release the new FoLiApy library.
4) Although the new FoLiApy library replaces the old one in PyNLPl, we should not drop the library from pynlpl for the time being, so that people can continue as is and have the two installed at the same time. This gives people a choice in whether they want to generate v1.5 or v2.0 documents, which is necessary because of transition work on other software, as described in the next points.
5) The work to implement v2 support for libfolia can start in the meantime (by @kosloot) and will take some time (2 months?). This in turn affects software like ucto and frog, which will not be FoLiA v2 compatible until that work is complete.
6) The work to implement v2 support for foliadocserve and FLAT can start in the meantime (by @proycon) and will take some time (2 months?). Until then, these tools simply rely on the pynlpl version and foliatools-legacy, and produce v1.5 documents.
7) Eventually, when all our software is v2 compatible, we drop the old/legacy versions. Note that the new FoLiApy library is mostly backwards compatible with the old pynlpl one anyway, so most third-party software shouldn't be affected much.

kosloot commented 5 years ago

I don't have insight into all the changes yet, so it is hard to estimate how much time it will take to upgrade libfolia. To be safe, let us take 3 months for this, as I will be working on it 2, sometimes 3, days per week at most. It must also include improving and updating the testing environment.

proycon commented 5 years ago

Agreed

proycon commented 5 years ago

The releases are in progress: FoLiA, FoLiApy, and FoLiA-tools have been released. A new LaMachine will follow shortly.

I think some extra work on foliatools is needed to prevent breakage of things that rely on producing FoLiA v1 (relates to #67), such as PICCL.