urinieto / jams

A JSON Annotated Music Specification for Reproducible MIR Research
BSD 2-Clause "Simplified" License

JAMS doesn't type-check data written to various fields. #24

Open rabitt opened 10 years ago

rabitt commented 10 years ago

Loading a jam with sandbox fields causes an error at line 375 of pyjams.py. For an example, try: jam = pyjams.load('jams/datasets/RockCorpus/1999.jams')

ejhumphrey commented 10 years ago

So it turns out there are two bugs, neither of which has anything to do with loading a real Sandbox:

  1. The RockCorpus parser sets the sandbox field as a string instead of a dict. On load, the JAMS object detects that this is wrong and complains (throwing this error).
  2. The JAMS object needs to make sure that it's writing sane data to disk.

(1) is fixed; renaming this issue to reflect (2).

ejhumphrey commented 10 years ago

After looking into this a bit, it seems that there is no robust, sane way to do this without rolling our own schema validator. This is partly due to the need to unpack referenced definitions in the schema. If this were just a look-up, it wouldn't be so bad ... but these kinds of cross-references make the process non-trivial. I think it goes without saying that we're all against doing this on our own.

There is at least one library that looks pretty promising: http://python-jsonschema.readthedocs.org/en/latest/validate/

It'd break the zero dependencies target, but that's a somewhat arbitrary goal. We could always make built-in validation optional, but it seems like a necessary feature given how quickly we ran into this problem.
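For discussion, here is a minimal sketch of what optional validation on write could look like. It assumes the third-party jsonschema package and uses its validate() function and ValidationError exception; the schema, check() and dump() below are hypothetical and heavily simplified, not the real JAMS schema or the pyjams API. The "$ref" entry stands in for the kind of cross-reference mentioned above, which the library resolves for us.

```python
import json

try:
    import jsonschema  # third-party; treated as an optional dependency here
except ImportError:
    jsonschema = None

# Hypothetical, heavily simplified schema (NOT the real JAMS schema).
# "sandbox" is declared through a cross-reference into "definitions".
SCHEMA = {
    "type": "object",
    "definitions": {
        "sandbox": {"type": "object"},
    },
    "properties": {
        "sandbox": {"$ref": "#/definitions/sandbox"},
    },
}


def check(data, schema=SCHEMA):
    """Return an error message, or None if the data is valid.

    Also returns None when jsonschema is not installed, i.e. validation
    is silently skipped in that case.
    """
    if jsonschema is None:
        return None
    try:
        jsonschema.validate(data, schema)
    except jsonschema.ValidationError as exc:
        return exc.message
    return None


def dump(data, path):
    """Hypothetical writer: refuse to put invalid data on disk."""
    error = check(data)
    if error is not None:
        raise ValueError("refusing to write invalid data: " + error)
    with open(path, "w") as fh:
        json.dump(data, fh)
```

With jsonschema installed, check({"sandbox": "a string"}) returns a message rather than None, which is the RockCorpus case from (1); without it, everything passes through unchecked, so whether skipping silently is acceptable would be part of the design decision.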

Additionally, we might want to talk design and implementation on this topic, i.e. where datatypes are checked. Checking at the set-attribute level would probably be preferable, but it might be too atomic. Checking on write would probably be easier to implement (because it can check the whole object for correctness), but it might make errors really annoying to trace back to their source.
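To make the trade-off concrete, here is a rough sketch of the two options using only the standard library. The classes, the EXPECTED_TYPES table and type_errors() are hypothetical stand-ins for illustration, not the pyjams classes.

```python
# Hypothetical field -> type table; the real set of fields would come
# from the schema.
EXPECTED_TYPES = {"sandbox": dict, "annotator": dict}


class CheckedOnSet(object):
    """Option A: check each field at assignment time."""

    def __setattr__(self, name, value):
        expected = EXPECTED_TYPES.get(name)
        if expected is not None and not isinstance(value, expected):
            # The traceback points at the offending assignment.
            raise TypeError("%s must be %s, got %s" % (
                name, expected.__name__, type(value).__name__))
        object.__setattr__(self, name, value)


class Unchecked(object):
    """Option B: accept anything now, check the whole object later."""


def type_errors(obj):
    """Return the names of all fields on obj that have the wrong type."""
    return sorted(
        name for name, expected in EXPECTED_TYPES.items()
        if hasattr(obj, name) and not isinstance(getattr(obj, name), expected)
    )


# Option A fails immediately, at the line that is wrong.
a = CheckedOnSet()
a.sandbox = {}
try:
    a.sandbox = "a string"
except TypeError as exc:
    error_a = str(exc)

# Option B reports every problem at once, but only at write time,
# far away from the code that set the bad value.
b = Unchecked()
b.sandbox = "a string"
b.annotator = {}
errors_b = type_errors(b)
```

Option A only sees one attribute at a time, so it cannot catch nested values mutated in place (e.g. a.sandbox['x'] = ...), which is one reason a whole-object check on write may still be needed.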

Changing this to an enhancement.