lexibank / pylexibank

The python curation library for lexibank
Apache License 2.0

Segments #233

Closed by LinguList 3 years ago

LinguList commented 3 years ago

This PR proposes a fairly quick check for working with lexibank datasets. A single function, "valid_sequence", is called inside the function that adds a form. If it evaluates to False, the form is judged to be wrongly coded (that is, to contain an error) and is listed among the first 100 forms that do not work.

This means one should always make sure that no forms are actually listed, even if the CLTS value shows "100%"!
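To make the idea concrete, here is a minimal sketch of how such a check could be wired into a form-adding function. It is not the code from this PR: the validity rules (no empty sequence, no blank segment, no "+" or "_" marker at the edges or doubled), the `FormCollector` class, and the `MAX_REPORTED` constant are all assumptions chosen for illustration. Only the name "valid_sequence" and the limit of 100 reported forms come from the description above.

```python
# Illustrative sketch only; the rules below are assumptions, not pylexibank's actual checks.
MAX_REPORTED = 100  # the PR description mentions listing the first 100 failing forms

MARKERS = {"+", "_"}  # assumed morpheme/word boundary markers


def valid_sequence(segments):
    """Return True if a list of segments looks well formed (hypothetical rules)."""
    if not segments:
        return False
    # A boundary marker at either edge is treated as an error.
    if segments[0] in MARKERS or segments[-1] in MARKERS:
        return False
    # Two boundary markers in a row are treated as an error.
    for prev, cur in zip(segments, segments[1:]):
        if prev in MARKERS and cur in MARKERS:
            return False
    # Empty or whitespace-only segments are treated as an error.
    return all(seg.strip() for seg in segments)


class FormCollector:
    """Hypothetical stand-in for the object that adds forms to a dataset."""

    def __init__(self):
        self.forms = []
        self.invalid = []  # capped at MAX_REPORTED entries

    def add_form(self, form, segments):
        ok = valid_sequence(segments)
        if not ok and len(self.invalid) < MAX_REPORTED:
            self.invalid.append((form, segments))
        self.forms.append((form, segments))
        return ok
```

With this shape, a dataset run would finish by printing `collector.invalid`; an empty list is the condition to check for, independently of any percentage reported elsewhere.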

xrotwang commented 3 years ago

Can we get the tests to pass?

LinguList commented 3 years ago

@xrotwang, this is the problem: the tests work on my machine, but I had to adjust them to the most recent clts version. That version, however, has not been released yet (and neither has pyclts). So what should we do? Do we need to release pyclts in some way?

xrotwang commented 3 years ago

I think this is the price to pay for decoupling development. So yes, if you need new features of CLTS, you have to make a release. I tend to limit my virtual environments to one package installed in development mode to prevent such problems from creeping in. It's a bit more work in the cases where functionality needs to be synced between packages, but the conceptual transparency makes up for this.

LinguList commented 3 years ago

Well, since we NEED to release the new CLTS anyway, this was inevitable, right? What we can also do for now is revert the modifications I made in pkg, get the tests to pass, and then modify them later. Unfortunately, there were some heavy modifications in clts recently, but I hope this will be over soon... At least the feature system should now be stable.

xrotwang commented 3 years ago

OK, so it seems that the new CLTS release and the new pyclts release will both need to be new major versions, right? In that case, I'd even advocate making a pylexibank release before updating, then releasing CLTS and pyclts, and then resuming work on pylexibank.

xrotwang commented 3 years ago

Oh, and the pylexibank release before upgrading CLTS should have appropriate conditions in setup.py, e.g. 'pyclts>=2.0,<3'.
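As a rough illustration of what that condition means, the sketch below states the requirement string and checks a few version numbers against its bounds. The `INSTALL_REQUIRES` name and the `satisfies` helper are hypothetical and only compare plain numeric versions; in a real setup.py the string would simply be passed to setuptools in `install_requires`, and pip would do the matching.

```python
# Illustrative sketch; not pylexibank's actual setup.py.
# In setup.py this list would be passed as setup(..., install_requires=INSTALL_REQUIRES).
INSTALL_REQUIRES = ["pyclts>=2.0,<3"]


def _key(version, width=3):
    """Turn '2.1' into (2, 1, 0) so versions compare numerically (numeric parts only)."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def satisfies(version, lower="2.0", upper="3"):
    """Return True if lower <= version < upper, mirroring '>=2.0,<3'."""
    return _key(lower) <= _key(version) < _key(upper)
```

Under this pin, any pyclts 2.x release is accepted, while a future 3.0 with breaking changes is not pulled in by the pylexibank release made before the upgrade.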

LinguList commented 3 years ago

The schedule, which I hope we can keep, would be to have the clts/pyclts release done next week or in the second week of December. I'd like to add two more transcription datasets and do a thorough check of pyclts.

LinguList commented 3 years ago

Ah, and reverting the clts changes prevents this from failing.