timmahrt / praatIO

A Python library for working with Praat, TextGrids, time-aligned audio transcripts, and audio files. It is primarily used for extracting features from and manipulating audio files given hierarchical time-aligned transcriptions (utterance > word > syllable > phone, etc.).
MIT License

More idiomatic json format #40

Closed timmahrt closed 1 year ago

timmahrt commented 2 years ago

The JSON format exported by praatio largely mirrors the textgrid format in terms of the data that is output.

As requested in a different GitHub project (https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner/issues/453), there isn't really any need for the JSON output to be bound by the textgrid format. We should structure the JSON files to be their own thing.

Details can be found in the above link.
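
To make the idea concrete, here is a rough sketch of the difference between a textgrid-mirroring layout and a more idiomatic one. Every key name here (`xmin`, `class`, `start`, `entries`, and so on) and the `to_idiomatic` helper are hypothetical, chosen only for illustration; this is not the schema praatio actually writes, and it uses only the standard library rather than praatio's API.

```python
import json

# Hypothetical textgrid-mirroring layout: tiers are an ordered list, and each
# tier repeats the bookkeeping fields found in a Praat TextGrid file.
textgrid_like = {
    "xmin": 0.0,
    "xmax": 1.0,
    "tiers": [
        {
            "class": "IntervalTier",
            "name": "words",
            "xmin": 0.0,
            "xmax": 1.0,
            "entries": [[0.0, 0.4, "hello"], [0.4, 1.0, "world"]],
        },
        {
            "class": "TextTier",
            "name": "clicks",
            "xmin": 0.0,
            "xmax": 1.0,
            "entries": [[0.25, "click"]],
        },
    ],
}


def convert_entry(entry):
    """Turn a positional entry into a self-describing dict (assumed shape)."""
    if len(entry) == 3:
        start, end, label = entry
        return {"start": start, "end": end, "label": label}
    time, label = entry
    return {"time": time, "label": label}


def to_idiomatic(data):
    """Convert the textgrid-like dict into a name-keyed layout (hypothetical)."""
    tiers = {}
    for tier in data["tiers"]:
        tiers[tier["name"]] = {
            "type": "interval" if tier["class"] == "IntervalTier" else "point",
            "entries": [convert_entry(entry) for entry in tier["entries"]],
        }
    return {"start": data["xmin"], "end": data["xmax"], "tiers": tiers}


idiomatic = to_idiomatic(textgrid_like)
print(json.dumps(idiomatic, indent=2))
```

In this sketch, a consumer can look up a tier by name and read labelled fields without knowing anything about the textgrid layout, which is the kind of decoupling being asked for.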

timmahrt commented 1 year ago

@mmcauliffe Sorry to ping you. It's a bit late but I have a PR up with the new json format as we discussed about half a year ago 😅. If you have any concerns, please let me know. I was thinking of putting out the release next weekend or so.

timmahrt commented 1 year ago

I've cut a release, praatio 6.0, that includes the changes to the JSON format. I opened this issue, so I'll close it now, but feel free to reopen it if there are any comments.

mmcauliffe commented 1 year ago

Awesome, thanks for this! Sorry I didn't have a chance to look it over previously, but it looks good to me. I saw the PR for updating to 6.0 on conda-forge, so I'm going to go through MFA to make sure everything's compatible.