urinieto / jams

A JSON Annotated Music Specification for Reproducible MIR Research
BSD 2-Clause "Simplified" License
13 stars 0 forks source link

Utility functions for reading labfiles (parsers) #13

Closed ejhumphrey closed 10 years ago

ejhumphrey commented 10 years ago

Currently, the isophonics parser does all of its labfile reading inside the function that also populates the jam. This concerns me for a few reasons:

  1. Functions should generally be as atomic as possible, i.e. do one thing.
  2. Most parsers are going to need labfile reading, which is going to create a lot of redundant code.
  3. A properly bullet-proof labfile reader is, unfortunately, non-trivial, to catch the various tab-vs-whitespace and hidden characters that can break a parser ("\r\n").

I'd propose creating a utility module that provides commonly used methods for the parsers. Ideally these would be tested, but the big gain is the reuse of functions.

Uri, I'll hang this on you for the time being, since you've already made decisions in this area. What are your thoughts on how to proceed?

urinieto commented 10 years ago

+1

When I saw your isophonics parser I thought: "damn, I should've done it like that". Maybe it would be easy to remove my isophonics parser and put yours instead (with all its necessary files), and make it as an example to the rest of the parsers that still need to be written?

ejhumphrey commented 10 years ago

Okay, well let's all put a pin in that. I just wrote a util file (motivating a python package rather than a module, interestingly enough) with this functionality, so I'll commit this soon and close the issue. I wasn't sure if you had some good reason for doing it the way you did.

urinieto commented 10 years ago

Sounds good. No reason beyond having it already written for MSAF.

rabitt commented 10 years ago

Is it worth just using mir_eval's lab parsers and then going from there to jams format?

ejhumphrey commented 10 years ago

My $0.02USD ... In the future, maybe, but for the time being, no. I think we should minimize dependencies as much as possible. Right now, I think we might even be pure python. (EDIT: yes, right now we have zero dependencies, which rocks)

ejhumphrey commented 10 years ago

also, these are committed / in the repository now.

worth mentioning, and perhaps forking into a new / separate issue... we should harmonize which, if any, fill_xxx functions go in util. There's a trade-off here between reuse and clarity to new users.

ejhumphrey commented 10 years ago

In cleaning the isophonics parser, this is done as of b62c84bc597f.