google-research / robopianist

[CoRL '23] Dexterous piano playing with deep reinforcement learning.
https://kzakka.com/robopianist/
Apache License 2.0

PIG Dataset Issue Tracker #7

Open kevinzakka opened 1 year ago

kevinzakka commented 1 year ago

The PIG dataset has a few issues:

We need to fix these issues or find all the affected songs so that they can be excluded from the benchmark score calculation.

NeroBlackstone commented 9 months ago

Wrong tempo and inconsistent sustain are not the only problems. The PIG dataset contains many other types of errors, including fingerings that cannot physically be played, such as crossed chords. This was already found in earlier analyses of the PIG data.

Please read "Checklist Models for Improved Output Fluency in Piano Fingering Prediction".

Since the classical pieces in PIG are in the public domain, I think we should establish an open-source fingering dataset so that researchers can collaborate on correcting errors and improving the quality of fingering annotations for data-driven methods.

Another weakness of PIG is that it is not organized in a common format and is difficult to parse.

Maybe we could convince the authors (Nakamura) to open-source the PIG dataset and recruit volunteers to maintain it.

PS: I read your paper. Really amazing work!

kevinzakka commented 9 months ago

Thanks for the reference @NeroBlackstone! Agreed regarding maintaining and open-sourcing PIG. In an ideal world, the agent can discover its own fingering, in which case no need for PIG, but alas exploration is hard :)

NeroBlackstone commented 9 months ago

> In an ideal world, the agent can discover its own fingering, in which case no need for PIG, but alas exploration is hard :)

If the player is a robot (as in this work), the PIG dataset may not be needed, as you said.

An enumerated action space with invalid action masking, plus a few optimization objectives, is enough to let the agent discover its own "best fingering".
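As a rough illustration of the enumeration plus invalid action masking idea, here is a minimal Python sketch. It is not RoboPianist code: the function names, the validity rule (no crossed fingers, a per-finger stretch limit), and the `MAX_SEMITONES_PER_FINGER_STEP` value are all assumptions made up for this example, and it only covers a single right-hand chord.

```python
import itertools
import math

FINGERS = (1, 2, 3, 4, 5)  # right hand: 1 = thumb, 5 = little finger
MAX_SEMITONES_PER_FINGER_STEP = 5  # hypothetical stretch limit, not a measured value


def enumerate_actions(num_notes):
    """Enumerate every assignment of distinct fingers to the notes of a chord."""
    return list(itertools.permutations(FINGERS, num_notes))


def is_valid(pitches, fingering):
    """Hypothetical validity rule for a right-hand chord.

    `pitches` are MIDI note numbers sorted in ascending order;
    `fingering[i]` is the finger that plays `pitches[i]`.
    """
    for i in range(len(pitches) - 1):
        finger_gap = fingering[i + 1] - fingering[i]
        pitch_gap = pitches[i + 1] - pitches[i]
        if finger_gap <= 0:
            return False  # crossed chord: a higher note gets a lower finger
        if pitch_gap > MAX_SEMITONES_PER_FINGER_STEP * finger_gap:
            return False  # stretch too wide for this pair of fingers
    return True


def masked_softmax(logits, mask):
    """Softmax that assigns exactly zero probability to masked-out actions."""
    masked = [x if keep else -math.inf for x, keep in zip(logits, mask)]
    peak = max(masked)
    if peak == -math.inf:
        raise ValueError("every action is masked out")
    exps = [math.exp(x - peak) for x in masked]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    chord = (60, 64, 67)  # C major triad
    actions = enumerate_actions(len(chord))
    mask = [is_valid(chord, a) for a in actions]
    # Uniform logits stand in for whatever a policy network would output.
    probs = masked_softmax([0.0] * len(actions), mask)
    for action, p in zip(actions, probs):
        if p > 0:
            print(action, round(p, 3))
```

Under these assumed rules, the C major triad has 60 enumerated fingerings, of which 10 survive the mask. The policy can then only ever sample from those 10, so the optimization objectives only need to rank playable candidates.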

But it's difficult to optimize multiple objectives, and the objectives may even conflict with one another. Moreover, the resulting fingerings are often not suitable for human players.

My conclusion is that a human-labeled dataset (like PIG) is essential for learning an initial expert policy. Otherwise, the resulting fingerings are unlikely to be useful to humans.

Maintaining PIG is essential for generating human-playable fingerings.

NeroBlackstone commented 9 months ago

> In an ideal world

I don't think such a perfect environment model exists, even if we could use 3D models to simulate every detail of the hand.

That is because one thing can't be simulated: the feeling we have when playing.

In other words, humans make fingering decisions based on their perception of the difficulty of the fingering.

This “feel” must be extracted using human-annotated datasets.