Open Yuan-ManX opened 1 year ago
I would also be happy to see the data set. In addition, it would be nice to have the training code available. In my test runs the results seem sensitive to how the staffs are cropped from a larger image. I would think that this could be improved by adding more distortions to the training set.
Will the dataset be made public?
From what I can see, the data set was never published. While I still hope that this might change in the future, I started an attempt to train this model on a mix of the PrIMuS data set and the Grandstaff data set. The results aren't as robust yet as what I get with the weights provided in this repo, but in some cases it works well. I put the training code so far on my fork of this repo: https://github.com/liebharc/Polyphonic-TrOMR
You are right. I have also attempted to train TrOMR on the PrIMuS dataset, simply by scaling the images to a fixed size. My results show that TrOMR's performance does not exhibit a significant advantage, with a symbol error rate exceeding 3% on the CameraPrIMuS dataset. Can you share your test results?
I haven't calculated a symbol error rate yet. Right now, I run the inference on a small set of example images, such as https://github.com/BreezeWhite/oemer/blob/main/figures/tabi.jpg (after splitting it into single staff images) to get a feeling on how well it performs.
Is the code you are using to calculate the SER available somewhere? To get meaningful results, I'd also need another data set to calculate the SER on. Since PrIMuS is used for the training, I can't of course also use it to rate the performance of the results. At least for monophonic examples, it shouldn't be too hard for me to find another data set.
I will open-source my code once everything is ready, but currently it's still under development. You can calculate the symbol error rate by measuring the edit distance between the predicted sequence generated by the model and the ground truth, normalized by the length of the ground truth. You can run `pip install editdistance` to install a package for calculating the edit distance.
Regarding the dataset, I trained the model using approximately 60,000 images from the PrIMuS dataset and then tested it on around 10,000 images. I also experimented with training on a smaller scale of images and found that TrOMR may not fully demonstrate its true capabilities when the dataset size is small.
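For reference, the SER computation described above can be sketched as follows. This is a minimal self-contained version that implements the Levenshtein edit distance directly (the `editdistance` package computes the same quantity); the token strings in the example are hypothetical semantic-encoding tokens, not taken from the actual PrIMuS ground truth files.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (two-row DP)."""
    m, n = len(ref), len(hyp)
    prev = list(range(n + 1))  # distances for an empty ref prefix
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            cur[j] = min(prev[j] + 1,        # deletion
                         cur[j - 1] + 1,     # insertion
                         prev[j - 1] + cost) # substitution / match
        prev = cur
    return prev[n]

def symbol_error_rate(ref_tokens, hyp_tokens):
    """SER = edit distance normalized by the ground-truth length."""
    return edit_distance(ref_tokens, hyp_tokens) / max(len(ref_tokens), 1)

# Hypothetical example: one substituted duration token out of four symbols.
gt   = ["clef-G2", "keySignature-DM", "note-D4_quarter", "note-F#4_quarter"]
pred = ["clef-G2", "keySignature-DM", "note-D4_half", "note-F#4_quarter"]
print(symbol_error_rate(gt, pred))  # 1 edit over 4 tokens -> 0.25
```

In practice you would average this over the whole test set, typically weighting by sequence length (total edits divided by total ground-truth tokens) so long staves are not underrepresented.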
Is the dataset open source? How can I download it?