marl / jams

A JSON Annotated Music Specification for Reproducible MIR Research
ISC License

rewrite unit tests #26

Closed · bmcfee closed this issue 9 years ago

bmcfee commented 9 years ago

... since the schema's all different, and much of the (python) codebase has been simplified or rewritten, this is probably necessary.

@ejhumphrey Any objection to switching from unittest.TestCase objects (old and busted) over to nosetest functions (new hotness)? This will make tests easier to write, and play nicely with #25 .
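For readers unfamiliar with the distinction, a hypothetical before/after sketch of the proposed switch (names and values are illustrative, not from the jams codebase):

```python
import unittest

# Old and busted: the check wrapped in a unittest.TestCase class.
class TestEquivalence(unittest.TestCase):
    def test_jobject_equality(self):
        self.assertEqual({'value': 'C:maj'}, {'value': 'C:maj'})

# New hotness: a plain function that nose collects by its test_ prefix.
def test_jobject_equality():
    assert {'value': 'C:maj'} == {'value': 'C:maj'}
```

nose discovers any module-level callable whose name matches its test pattern, so no class scaffolding or `self.` boilerplate is needed.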

bmcfee commented 9 years ago

Sketching out a plan for test design:

  • JObject
    • serialization, deserialization
    • equivalence testing
    • dict-like methods
    • search
  • Sandbox
    • just test for construction; all else inherits from JObject
  • JamsFrame
    • construction, conversion
    • add_observation
    • interval_values conversion
    • serialization (sparse and dense)
  • Annotation
    • construction
    • append
  • Curator
    • construction
  • AnnotationMetadata
    • construction
  • FileMetadata
    • construction
  • AnnotationArray
    • construction
    • search
    • serialization
  • JAMS
    • construction
    • load
    • save
    • add
    • validate
  • pyjams functions
    • load
    • append
    • load_lab
  • Utilities
    • timedelta_to_float
    • serialize_obj (should be implicit from JamsFrame tests)
    • read_lab (do we still need this?)
    • load_textlist
    • expand_filepaths
    • smkdirs
    • filebase
    • find_with_extension
    • match_query (semi-redundant with JObject.search test)
    • query_pop (also semi-redundant with JObject.search test)

urinieto commented 9 years ago

Nosetests +1

This test list is.... well, I have no words, I'm so impressed.
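As a concrete sketch of the serialization/deserialization item, written in the flat nose style; a plain dict stands in for a jams object here, since the point is the round-trip pattern rather than the jams API itself:

```python
import json

def test_roundtrip_serialization():
    # Serialize, deserialize, and check we get an equivalent object back.
    original = {'time': 0.0, 'duration': 5.0, 'value': 'verse'}
    restored = json.loads(json.dumps(original))
    assert restored == original
```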

bmcfee commented 9 years ago

This test list is.... well, I have no words, I'm so impressed.

save being impressed for after we've implemented it!

ejhumphrey commented 9 years ago

sounds great; any preferred examples to share of this in practice? or any ole google search will do?

bmcfee commented 9 years ago

any preferred examples to share of this in practice?

nose is super easy to use, but librosa has tons of (not so clean) examples

bmcfee commented 9 years ago

In the process of implementing tests, I've been doing a little refactoring and simplification.

Specifically:

As such, I've moved the lab->annotation loader from pyjams into util, where it makes more sense.

So, the question for all of you is:

urinieto commented 9 years ago

We do need a function to convert labs to jams. As long as it works, I don't care whether we use the current read_lab or not (sorry, is that what you're asking?).

bmcfee commented 9 years ago

Yes; to clarify, we currently have two:

  • read_lab reads csv directly, and returns a flat array. This function exists in the initial jams release.
  • import_lab uses the pandas csv loader, requires a target namespace, and constructs an Annotation object. Optionally, it can append directly into an existing JAMS. I added this function to streamline the process and work around some of the inevitable quirks of csv parsing.

I'm advocating dropping read_lab, but if there's a compelling reason to include our own csv/tsv->array parser, I'm open to keeping it.

urinieto commented 9 years ago

I'd rather not use a custom implementation if another one already exists in, e.g., pandas.
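For context, the flat parsing that read_lab performs amounts to roughly the following (a hypothetical stdlib-only reconstruction, not the actual jams code; import_lab delegates this work to pandas.read_csv instead):

```python
import csv
import io

def parse_lab(fileobj):
    """Parse tab-delimited (start, end, label) rows into a flat list of tuples."""
    rows = []
    for fields in csv.reader(fileobj, delimiter='\t'):
        rows.append((float(fields[0]), float(fields[1]), fields[2]))
    return rows

# A two-segment .lab file, inlined for illustration.
lab_text = "0.000\t5.000\tverse\n5.000\t9.500\tchorus\n"
rows = parse_lab(io.StringIO(lab_text))
```

The hand-rolled version has to own all the csv quirks (delimiters, comments, malformed rows) that pandas already handles, which is the argument for dropping it.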

justinsalamon commented 9 years ago

pandas +1

bmcfee commented 9 years ago

Coming back to this thread now.

What are people's thoughts on dropping load_textlist as well? It's a one-liner that gets used twice in the (old) import scripts. It sucks an entire text file into an array of strings, in memory.

I'd like to drop it, since it's A) simple enough to reimplement as needed (with increased clarity in the resulting code), and B) kind of a bad pattern, since it could crash out of memory if executed on a very large file. I think, in all cases, we can replace it with calls to pandas csv/table parsers with no loss of functionality.
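For reference, load_textlist boils down to roughly this (a hypothetical reconstruction), which is why it is easy to reimplement inline, and also why it holds the whole file in memory at once:

```python
import os
import tempfile

def load_textlist(path):
    """Read an entire text file into a list of stripped lines (all in RAM)."""
    with open(path) as fh:
        return [line.strip() for line in fh]

# Streaming alternative: iterate the handle directly, holding one line at a time:
#     with open(path) as fh:
#         for line in fh:
#             process(line.strip())

# Demo on a throwaway file.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('foo.wav\nbar.wav\n')
lines = load_textlist(tmp.name)
os.remove(tmp.name)
```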

ejhumphrey commented 9 years ago

++

urinieto commented 9 years ago

++ too

justinsalamon commented 9 years ago

:+1:

bmcfee commented 9 years ago

Do we really need the `jams.append()` function? It seems to do too much: load, append, save. I would prefer to factor out the load/save (already done) and just have a single function that appends one jam to another.

(Again, this is in the interest of simplifying the API and unit tests.)
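A sketch of the single-responsibility version being proposed, with plain dicts standing in for JAMS objects (hypothetical names, not the actual jams API):

```python
def append_jams(jam_a, jam_b):
    """Merge the annotations of two in-memory jams; load/save stay external."""
    return {'annotations': jam_a['annotations'] + jam_b['annotations']}

merged = append_jams({'annotations': ['beat']}, {'annotations': ['chord']})
```

The caller composes this with the existing load/save as needed, which keeps each function trivially testable.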

justinsalamon commented 9 years ago

+1

urinieto commented 9 years ago

I would also support this. However, I would like to sit down with you because I'm lost about the state of things with JAMS, and I would really like to contribute.

bmcfee commented 9 years ago

Ok folks:

I've started pushing up some namespace validation tests. They're super easy to write, but there are a lot of them, so help would be appreciated.

Existing examples are to be found in jams/tests/namespace_tests.py.

Some basic guidelines:

And, because I'm such a benevolent dictator, I made homework problems for everyone!


Assignments

bmcfee commented 9 years ago

In the course of writing namespace tests, I uncovered a fun bug in the serialization logic.

JObject.json was (intentionally) skipping over fields whose values evaluated to False, which includes empty strings and zeros. When you serialize the object, this omits several fields from the resulting dictionary structure, including things like FileMetaData.release and Curator.email. Then, when you validate the entire object, since these fields are required, it barfs.

I've fixed it so that even empty fields are returned in the structure serialization. This inflates the object on disk somewhat, but improves safety since we can guarantee that all fields are defined (if not populated).
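The bug in miniature (an illustrative stand-in, not the actual JObject.json code): filtering serialized fields by truthiness silently drops empty strings and zeros, while filtering only on None keeps every defined field.

```python
def serialize_buggy(fields):
    # Truthiness test drops '' and 0, so required-but-empty fields vanish.
    return {k: v for k, v in fields.items() if v}

def serialize_fixed(fields):
    # Only genuinely unset (None) fields are dropped.
    return {k: v for k, v in fields.items() if v is not None}

# Empty-but-required metadata fields, as in FileMetaData.release / Curator.email.
meta = {'release': '', 'email': '', 'duration': 0}
```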

urinieto commented 9 years ago

Nice.

urinieto commented 9 years ago

I just realized that JAMS files no longer validate if the duration is missing from the file_metadata field. What if we don't really have this info? (e.g., it is missing for some of the SALAMI songs listed in the metadata.csv file).

bmcfee commented 9 years ago

What if we don't really have this info? (e.g., it is missing in some of the SALAMI songs listed in the metadata.csv file).

You should be able to set it to np.nan in this case. But don't all of the salami annotations have end-of-track event markers?

urinieto commented 9 years ago

Unfortunately, these end-of-track event markers are not always correct (e.g., see track number 10).

bmcfee commented 9 years ago

Unfortunately, these end-of-track event markers are not always correct (e.g., see track number 10).

I looked at track 10 and don't entirely understand the problem?

[~/data/SALAMI/data/10]➔ for i in *.txt ; do echo -n "$i  " ; tail -1 $i ; echo ; done
textfile1.txt  314.269342403    End
textfile2.txt  314.232267573    End

Yes, many of the annotations give different end time, but they're pretty close. Why not just take the max of candidate end times as an estimate?
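The suggested estimate is just the maximum over annotators' end-of-track markers; using the two values from the track-10 output above:

```python
def estimate_duration(end_times):
    """Estimate track duration as the max candidate end-of-track marker."""
    return max(end_times)

# End markers from the two track-10 annotation files shown above.
duration = estimate_duration([314.269342403, 314.232267573])
```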

urinieto commented 9 years ago

Oh yes, you're right. I misread the annotations. Ok, so I will go ahead and add the max estimate to the duration field in the metadata in the SALAMI / Isophonics parser.

justinsalamon commented 9 years ago

Do we have any jams files that validate against the latest schema kicking around anywhere? Or a quick way to randomly generate some? Sheesh, it's been a while...

update: just found jams/tests/fixtures/valid.jams

bmcfee commented 9 years ago

Do we have any jams files that validate against the latest schema kicking around anywhere? Or a quick way to randomly generate some?

The SMC examples should validate. There's also one copied over into test/fixtures that I use in the generic unit tests.

If you want to generate random ones, check this for an example of how. For the namespace tests, you only need to make Annotation objects, not full JAMSes.

justinsalamon commented 9 years ago

Trying to write a test for pitch_hz. Neither of the following two validates (I wasn't sure how to build a dense annotation, since pitch_* seems to be the only case?):

import numpy as np
from jams import Annotation

def test_ns_pitch_hz_valid():

    ann = Annotation(namespace='pitch_hz')

    for time in np.arange(5.0):
        ann.append(time=time, duration=0.0, value=100.0, confidence=None)

    ann.validate()

And:

def test_ns_pitch_hz_dense_valid():

    ann = Annotation(namespace='pitch_hz')

    seq_len = 5
    time = np.arange(seq_len)
    duration = np.zeros(seq_len)
    hz = np.random.uniform(-22050., 22050., seq_len)
    confidence = np.array([None] * seq_len)
    ann.append(time=time, duration=duration, value=hz, confidence=confidence)

    ann.validate()

Any ideas?

bmcfee commented 9 years ago

Super weird; I'm getting the same thing.

If I turn off strict validation, I get the following warning:

In [18]: ann.validate(strict=False)
/home/bmcfee/git/jams/jams/pyjams.py:775: UserWarning: 'duration' is not of type u'object'

Will investigate later...

justinsalamon commented 9 years ago

Not sure if related, but when I create a pitch_hz annotation, the duration field gets converted into days. Could it be that duration is converted into some time-object that is not allowed by a dense annotation?

bmcfee commented 9 years ago

Not sure if related, but when I create a pitch_hz annotation, the duration field gets converted into days. Could it be that duration is converted into some time-object that is not allowed by a dense annotation?

No, that's just how timedelta fields get repr'd when the value is 0 and there are no other values to set an appropriate scale (i.e., seconds). The objects do serialize correctly.

You're right that it has to do with dense-vs-sparse serialization though:

In [6]: ann.data.dense = False

In [7]: ann.validate()
Out[7]: True

I think this is entirely down to the validate method, and not a problem with the serialization itself. I'll dig into it though.

bmcfee commented 9 years ago

I think this is entirely down to the validate method, and not a problem with the serialization itself. I'll dig into it though.

aha! Yep. We are implicitly assuming a SparseObject serialization that can be iterated over record-wise in this logic. This is easily fixable.

bmcfee commented 9 years ago

17d510f should fix this error via some list comprehension wizardry.

justinsalamon commented 9 years ago

pitch_hz allows negative values due to the "x<=0 implies unvoiced" convention. Do we want to keep this convention for pitch_midi? I'm not sure how an actual midi synth would handle negative note values, and it definitely doesn't use "0==silence". On the other hand, if we disallow these cases, then conversion between pitch_hz and pitch_midi becomes non-trivial.

bmcfee commented 9 years ago

Do we want to keep this convention for pitch_midi?

I would say no, since negative midi numbers are technically possible. For midi-type pitch annotations, I would expect silence to be indicated by the lack of annotation.

justinsalamon commented 9 years ago

but how would the lack of annotation be represented in a dense jams?

bmcfee commented 9 years ago

but how would the lack of annotation be represented in a dense jams?

Dense events still have duration, so I don't see a problem here.

justinsalamon commented 9 years ago

Dense events still have duration, so I don't see a problem here.

In the case of melody (pitch_hz), since the events happen on a uniform time grid, the duration values are not expected to carry any meaningful information (they can all be 0, for example), so you can't really use the duration to encode silence for dense annotations (at least not the way they are set up right now).
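To make the constraint concrete, a dense melody annotation looks roughly like this (made-up values): a uniform time grid with all-zero durations, so voicing has to be carried by the pitch values themselves rather than by gaps or durations.

```python
hop = 0.01                                   # uniform frame grid, e.g. 10 ms
times = [i * hop for i in range(5)]
durations = [0.0] * 5                        # carry no information here
pitches = [220.0, 220.5, -221.0, 0.0, 219.8]

# Under the pitch_hz convention, non-positive values flag unvoiced frames.
voiced = [p > 0 for p in pitches]
```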

bmcfee commented 9 years ago

[documenting offline conversation with @justinsalamon]

bmcfee commented 9 years ago

@rabitt can you link the site you mentioned for generating positive/negative examples from a regexp?

rabitt commented 9 years ago

@bmcfee http://fent.github.io/randexp.js/
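Checking the generated strings against a namespace pattern is then a one-liner with re; the pattern below is a hypothetical chord-like regexp for illustration, not one taken from the jams schema:

```python
import re

# Hypothetical value pattern: a pitch class, optional accidental, and quality.
pattern = re.compile(r'^[A-G][#b]?:(maj|min)$')

accepted = bool(pattern.match('C#:min'))   # a positive example
rejected = bool(pattern.match('H:maj'))    # a negative example
```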

bmcfee commented 9 years ago

I took the liberty of finishing the pattern_jku tests...

and we've got full tests! Closing this one out.

urinieto commented 9 years ago

Thanks titan!
