WIP : python3-ization and type hinting

pyannote / pyannote-metrics

A toolkit for reproducible evaluation, diagnostic, and error analysis of speaker diarization systems

http://pyannote.github.io/pyannote-metrics

MIT License

186 stars 33 forks source link

WIP : python3-ization and type hinting #37

Closed hadware closed 2 years ago

hadware commented 4 years ago

Same thing as for pyannote-core.

First question: for what you call "array-like", i'm proposing 3 solutions:

create this type and use it everywhere : ArrayLike = Union[Sequence, np.ndarray]
only use Sequence
only use np.ndarray

As for now, it doesn't look like there is anything usable for type-hinting array sizes (after a quick research). Ref: https://mypy.readthedocs.io/en/latest/cheat_sheet_py3.html#standard-duck-types (for Sequence)

hbredin commented 4 years ago

Happy new year @hadware !

Thanks for this pull request and sorry for the delay in looking at it. Looks like it does not pass the tests. Can you please fix that when you find the time?

hadware commented 4 years ago

Yes I will! I'll be here on next tuesday @ENS by the way.

hadware commented 2 years ago

Well, this should be close to done.

Regarding my former question: there's a nifty ArrayLike type in numpy.typing that does the job nicely.

hadware commented 2 years ago

Fixed your comments. I'm wondering: in some cases (notably: for all detection metrics), the required type for ref/hypothesis seems to be Annotation (because it then calls uemify which "tacitly" only accepts Annotation instances). Shouldn't uemify be able to also process Timeline instances?

Moreover, maybe we could extend the use of collar and skip_overlap (used in IER) to other metrics?

hbredin commented 2 years ago

Fixed your comments. I'm wondering: in some cases (notably: for all detection metrics), the required type for ref/hypothesis seems to be Annotation (because it then calls uemify which "tacitly" only accepts Annotation instances). Shouldn't uemify be able to also process Timeline instances?

It could be applied to Timeline instances but only do the cropping part, but not the common timeline part. This is because, Timeline instances are mostly used for segmentation metrics and we don't want to artificially split long segments into shorter ones.

Moreover, maybe we could extend the use of collar and skip_overlap (used in IER) to other metrics?

Once again, we'd have to be careful about segmentation metrics as these two notions probably don't make sense here.

Anyway, this should not be part of this PR.

Shall I merged this PR or do you plan to add more typing?

hadware commented 2 years ago

Ok, maybe in another PR, it'll indeed take some time to look into it.

Regarding this current PR, I think i'm mostly done with typing, i'll re-check everything, and it should be good.

hbredin commented 2 years ago

C'est votre dernier mot, Jean-Pierre ?

hadware commented 2 years ago

hbredin commented 2 years ago

🎉 Thanks @hadware !