probonopd / MiniDexed

Dexed FM synthesizer similar to 8x DX7 (TX816/TX802) running on a bare metal Raspberry Pi (without a Linux kernel or operating system)
https://github.com/probonopd/MiniDexed/wiki
1.11k stars 81 forks source link

Not all sysex files get loaded #455

Closed donluca closed 1 year ago

donluca commented 1 year ago

A huge amount of sysex files which are correctly recognized and loaded on Dexed are being written off as "not supported" in MiniDexed.

I got a big (3000+) collection of sysex files for the DX7 and loaded them onto MiniDexed but only ~80 of them are actually getting loaded.

I've quickly tried them in Dexed and most of them are correctly recognized and loaded. Others give a warning about not recognizing them and giving the option to load as "random data", but it actually correctly loads them.

EDIT: here's the entire collection converted to the MiniDexed format, ready to go: sysex.zip

donluca commented 1 year ago

I think I've found (one of?) the culprit: this line here limits the number of patches to load to 127.

What happens probably is that MiniDexed hits this limit and discards everything else.

@probonopd is there any particular reason why this limit was set to 127?

Banana71 commented 1 year ago

Isn't it so that all sysex files are loaded into ram at startup?

donluca commented 1 year ago

Are there ways to manually load sysex files manually?

I've changed that line and now it is correctly loading all the sysex files but it is taking literally forever.

I'd prefer to load them "on demand" rather than loading them all at startup.

It is loading 4 sysex patches per second which means that to load all of my 3200 patches it is going to take a whopping 800 seconds, more than 13 minutes!

EDIT: 9 minutes!

Banana71 commented 1 year ago

I prefer a fast system start. ⌛😴💤

donluca commented 1 year ago

I don't mind waiting something like 10-15 seconds, but 9 minutes is just nuts and, on top of that, I can access only 129 Banks of the 3000 loaded?

Something is fishy...

EDIT: and on top of that some of the banks loaded have just empty patches... yeah, there's definitely something wrong happening here which needs to be investigated.

probonopd commented 1 year ago

I think I've found (one of?) the culprit: this line here limits the number of patches to load to 127.

What happens probably is that MiniDexed hits this limit and discards everything else.

Indeed.

@probonopd is there any particular reason why this limit was set to 127?

I can only speculate as to why we are only loading 127 banks at the moment, but I suspect it has to do with the (mis-?)conception that one can only select 127 banks via MIDI messages. It's actually 16383 when one uses LSB and MSB.

We can just increase that number and see what happens.

Is it really useful in any sense to allow more than that anyway? Those huuuuge collections of patches all over the net seem to be very redundant and one can't actually find anything in them anyway, can one?

donluca commented 1 year ago

Well, Minidexed is honestly amazing at managing lots and lots of banks thanks to the rotary encoder, I can scroll through hundreds of banks without issues, so while you're right that in those mega-collection there's probably a lot of overlapping, why shouldn't we allow the user to have as many sysex files as he wants?

Besides, there are more serious issues at hand which I've raised here: some sysex files are not loading correctly, they show as banks with no name and have all their voices empty.

Unfortunately I have no clue why this happens as all the patches I've tested load and play fine on Dexed.

I'm afraid that I've condensed way too many issues into a single thread, but being discussed here, for easier reading are the following issues:

  1. MiniDexed limited to loading 127 banks
  2. Even when MaxVoiceBankID is increased, in Minidexed menu only 129 banks are available
  3. All sysex files are loaded into RAM at the start. This means that boot times are greatly increased. With 3200 sysex files it takes 9 minutes to boot. There should be a way to load sysex files "on demand".
  4. While Minidex says that a sysex file has been correctly loaded, some of those are shown with no name in the menu and contain only empty voices, which means that they have not been correctly loaded

Should I open separate issues?

EDIT: oh actually it seems like Issue 2 has already been taken care of by #395 by @abscisys , correct?

probonopd commented 1 year ago

MiniDexed limited to loading 127 banks#

I lifted that limit in the lastest continuous build.

Even when MaxVoiceBankID is increased, in Minidexed menu only 129 banks are available

Have you tried to change banks >127 using MIDI commands?

All sysex files are loaded into RAM at the start. This means that boot times are greatly increased. With 3200 sysex files it takes 9 minutes to boot. There should be a way to load sysex files "on demand".

Indeed, https://github.com/probonopd/MiniDexed/blob/main/src/sysexfileloader.cpp would need to be changed to work on individual files rather than whole directories. And https://github.com/probonopd/MiniDexed/blob/main/src/uimenu.cpp would need to be changed to reflect the CSysExFileLoader changes.

Right now the number of available banks that can be used (CSysExFileLoader::GetNumLoadedBanks), e.g., through MIDI bank switch commands, is determined in sysexfileloader.cpp after having loaded all available banks. Instead, it should probably just return the number of files with the .syx extension in the directory, regardless of whether they can actually be loaded or not. Once the user requests a particular .syx to be loaded and it turns out it cannot be loaded, we could load a bank of INIT patches instead.

@diyelectromusic wdyt?

While Minidex says that a sysex file has been correctly loaded, some of those are shown with no name in the menu and contain only empty voices, which means that they have not been correctly loaded

Can you make a list of (some of) those? Can we find out in which ways the non-working ones are different from the working ones?

donluca commented 1 year ago

I haven't tried changing banks with MIDI commands, but, generally speaking, I'd like to do all the voice selection directly on the MiniDexed rather than relying on external hardware. I edited my message above because I've seen that #395 has possibly already taken care of that, so maybe that's one less thing to worry about.

Tomorrow I'll zip a selection of sysex files which load correctly on Dexed but not on Minidexed and upload it here so you can examine them.

probonopd commented 1 year ago

I haven't tried changing banks with MIDI commands, but, generally speaking, I'd like to do all the voice selection directly on the MiniDexed rather than relying on external hardware.

Yes, I understand that. For debugging it is still important to know whether it works that way. Then we know where to look for the problem.

Are the symptoms similar to https://github.com/probonopd/MiniDexed/issues/83?

BobanSpasic commented 1 year ago

There are a lot of corrupted files in that collection:

MiniDexed will correctly load just plain DX7 VMEM SysEx dumps and nothing else.

MiniDexedCC can repair a lot of DX7 SysEx files, like the ones with the stripped headers, or the ones with additional (unneeded) headers. In the fact, I think that I've already implemented a repair functions for all the cases in that collection.

probonopd commented 1 year ago

Hi @BobanSpasic, is there a way to repair a .syx file using MiniDexedCC from the command line? This way, we could batch-fix the entire zip posted by @donluca.

BobanSpasic commented 1 year ago

Hi @BobanSpasic, is there a way to repair a .syx file using MiniDexedCC from the command line? This way, we could batch-fix the entire zip posted by @donluca.

I'll write a console tool for that. btw. not all the files could be "repaired" from that collection. There are single-voice (VCED, 163 bytes) files there - these need to be merged into a bank dump (32xVCED + conversion from VCED to VMEM = 1xVMEM). Any idea how to input 32 files to be merged as a command line input? Taking a text file with a file-list as an input?

probonopd commented 1 year ago

To get the ball rolling quickly, I guess the easy way would be to make a bank that contains only the converted patch plus 31 INIT patches.

donluca commented 1 year ago

@BobanSpasic oh man, I wish I had found your tool earlier...

I second @probonopd option to just convert to a bank with 31 empty patches, although I'd like to see in MiniDexed a change were empty patches are skipped entirely when scrolling through the various voices, to keep things clean.

EDIT: actually, I've noticed that there are not many of these, so we could definitely put them together. You can put 001->032 into one and 033->060 into another. Finally the others into another one.

donluca commented 1 year ago

@BobanSpasic I've created 3 text files with the list of the sysex file names in it for your batch file

sysex1.txt is full 32 patches sysex2.txt is 27 patches, but I've repeated the last one 5 times so that it is 32 sysex3.txt is 29 patches, but I've repeated the last one 3 times so that it is 32

Ideally instead of repeated patches we want empty ones but I guess we can manually do that after you've created the full sysex files, after all it's just 3 banks

sysex1.txt sysex2.txt sysex3.txt

I've also changed all the extensions to lowercase for better consistency and avoid possible issues due to case sensitivity, so here's the new updated zip with all the files.

sysex.zip

diyelectromusic commented 1 year ago

@diyelectromusic wdyt?

I've not been following this discussion closely enough to really see what is going on, but I agree with @donluca that this probably needs breaking into separate issues.

I want to get back to patch handling again myself and would like an update that only limits the number of banks to those loaded successfully. That is especially key as we allow more banks as a possibility. A user only wants to scroll between the banks they've installed, not scrolling past loads of empty slots.

In terms of how robust the handling should be - I don't think we should clutter the MiniDexed code with lots of possibilities for handling malformed or slightly off-spec files - that would be code running every time it tries to load a file, compared to someone tidying up files on the SD card, which they'd only have to do once! We could even curate a "known good set" ourselves I guess (or at least point out and link to one we're happy with).

But it should be clear that any it rejects do not become blank entries and get properly ignored I think.

Kevin

probonopd commented 1 year ago

Hi Kevin, with "@diyelectromusic wdyt?" I was referring to the idea of not loading all patches at bootup, but only when needed. I had outlined a possible approach for how to do that and am interested in what you think about it.

donluca commented 1 year ago

Fully agree with @diyelectromusic , let's use this Issue to discuss:

  1. Expanding the number of loadable Sysex (which has partially been done already)
  2. Managing off-spec sysex files
  3. Curating a library of sysex files
  4. Possibility of loading banks manually to reduce boot times
  5. Ignoring empty patches on the UI while scrolling through Voices

About 3: the link I've shared in the beginning points to what I personally believe is one of the best and most comprehensive Sysex collection I've seen over the web. There's little to no redundancy but it is not flawless, as already pointed out by Boban, but I believe that once we've cleaned it up it could be a very nice resource.

About 2: I agree with Kevin, the loading process, especially when you have 3000+ sysex files, is already pretty slow, let's not burden it with extra checks and let's just focus on 3 so that Minidexed users will have access to a proper, nice comprehensive library.

About 1: We just need to add the possibility in the UI to access beyond 127 Banks.

About 4: we can leave everything as it is and add a new folder in sysex/voices named "manual": the patches inside this folder can be loaded manually via the UI.

Finally, one last thought: what about discarding the idea of loading patches in RAM at boot time altogether and instead use a "lazy loading" approach? Banks take roughly ~300ms to load, it's not that bad. They are loaded when you either scroll the voices or the banks. It would make the scrolling more jittery for sure, but at least we don't have to worry about manually loading sysex files at boot and possibly waiting for lots of time to load a big collection...

diyelectromusic commented 1 year ago

Hi Kevin, with "@diyelectromusic wdyt?" I was referring to the idea of not loading all patches at bootup, but only when needed. I had outlined a possible approach for how to do that and am interested in what you think about it.

Ok. I'll try to spend a bit of time getting to see how the sysex loading works - its not something I've really taken a look at yet. But things are a bit busy this week, so it might not be for a few days...

Kevin

diyelectromusic commented 1 year ago

By the way, it looks like we could fairly easily load in the headerless voice banks by checking the file size in CSysExFileLoader::Load and if 4104 then do the header/end byte checks, but if 4096 just load the directly into VoiceBank[bank]->Voice.

ie. something like fread (m_pVoiceBank[nBank]->Voice, VoicesPerBank*SizePackedVoice, 1, pFile).

I don't think later code relies on the header values, but we should probably set them either all to 0 or add in the missing headers based on what the existing code checks for.

Of course, this removes any checking that might exist, but if there is no header to check, then there is no header to check :)

Kevin

diyelectromusic commented 1 year ago

While Minidex says that a sysex file has been correctly loaded, some of those are shown with no name in the menu and contain only empty voices, which means that they have not been correctly loaded

Ok, I think the bank ID detection is a little weird. From what I can see the following is currently happening:

  1. There are blank entries by default for all MaxVoiceBankID banks, so these will have blank names and zero voice parameters.
  2. Every time a bank is loaded m_nNumLoadedBanks is incremented. However, this is only useful if the banks have consecutive numbers in their filenames (see next point).
  3. Banks are not loaded into the VoiceBank based on the order in which they are found, they are loaded into the "slot" for the ID read from the filename format: 00000_name.syx

This means, I think - it would be nice is someone could confirm - that if your banks are in files labelled as follows: 00000_name1.syx 00001_name2.syx 00003_name3.syx

Then only entries for banks 0,1,3 will have names, and slot 2 will be blank, but m_nNumLoadedBanks will be 3. This means that in the UI, when selecting banks and auto-wrapping around, it will only be choosing between 0, 1, and the blank 2 before wrapping around back to 0....

We need to decide if the mapping of stored banks to loaded banks is absolute - i.e. all MaxVoiceBankID have to be considered and which are actually present depends on the filename numbers. In this case having non-consecutive filename numbers is possible, but a little odd.

Or if the mapping is relative and the filename IDs are largely irrelevant, but banks are loaded into consecutive slots based on the order that they are read from the filing system.

I don't think this is a problem I introduced when implementing the auto bank wrap around - I've just introduced the issue that some banks aren't counted if there are gaps in the filename numbering. I think non-consecutive filenames would always have left gaps...

Kevin

BobanSpasic commented 1 year ago

@diyelectromusic I took a look at the loading code a long time ago, and as I can recall - MiniDexed does not check the file checksum at all. It means - MiniDexed will also load corrupted banks (with probably weird results) as long as the file size is 4104 bytes. btw. same goes for Dexed

probonopd commented 1 year ago

Banks are not loaded into the VoiceBank based on the order in which they are found, they are loaded into the "slot" for the ID read from the filename format: 00000_name.syx

Yes, I think this is how it should be. But we should not load the banks at startup. Just determine the file with the highest 00000... number, and assume that many banks. Then, load the 00000... file whenever the user requests to load the bank with that ID. If, at the point when the bank is requested, it can't be loaded (isn't there, is corrupt,...) we just load a bank of INITs. Something along those lines.

donluca commented 1 year ago

I think there's actually a better way of handling this.

First thing, as @probonopd said, we must know how many banks are there, but that should be easy just by looking at the highest 000000_... number.

Then load the first 3 (for example) banks and any bank referenced by the default performance.

After that, whenever the user goes to the next bank, load in background the n+1 bank, where n is the number of currently loaded banks.

For example, let's say that Minidexed just started, the banks 000000, 000001 and 000002 are loaded by default.

The user scrolls through the voices in 000000 and arrives at bank 000001. As soon as he arrives at the first voice of 000001, in the background the n+1 bank is loaded which, in this case, means that 000003 is loaded, because the highest bank loaded is 000002.

And so on. When the user scrolls and goes to bank 000002, 000004 gets loaded, etc.

If the n+1 bank is higher than the last bank found at the first step, then just do nothing, it means that the user has scrolled through all the banks and has reached the end, which means all the banks are now loaded into RAM.

probonopd commented 1 year ago

Whether it worthwhile to add this complexity depends on how long it takes to load one bank. I could imagine the time for loading one bank is not so bad so we can do it "on demand"?

donluca commented 1 year ago

From my own tests I've seen that with an ancient Nokia 256MB SD it takes roughly ~300ms to load a bank, so not the end of the world, unless you' have to quickly scroll through hundreds of banks where it might become an annoyance.

That's why I proposed to add some sort of "buffer".

Another option would be to set the maximum of banks loaded at start at a certain number of banks on startup and then it keeps loading them in the background.

For the sake of simplicity, let's say that it takes 1 second to load 3 banks (worst case scenario, of course), and we don't want the user to wait longer than 10 seconds at boot, so we might fix this number at 32 banks on startup and the rest keep getting loaded in background.

If someone has a "modern", good SD card and can make some measurements on how long it takes to load a bank we can make some more realistic assumptions which will probably lead to better (and maybe easier) decisions.

BobanSpasic commented 1 year ago

Hi @BobanSpasic, is there a way to repair a .syx file using MiniDexedCC from the command line? This way, we could batch-fix the entire zip posted by @donluca.

https://github.com/BobanSpasic/MDX_Tool

As for now, it will repair just the files with 4096 and 4103 bytes. More will come, but after the Eastern Holidays.

donluca commented 1 year ago

Thank you!

Also, wow, Pascal! That's a language I haven't seen in a loooong time. Brings back many memories from when I was in high school. Meanwhile I've been cleaning up that collection, it had lots of duplicates and I'm still finding new ones. Also, ironically, it is lacking other more common known banks such as the Yamaha Voice ROMs and I think I'll also try to add the 4-op FM synths banks converted to 6-op, just for completeness sake.

@probonopd should I create a discussion about creating a curated library with sysex files already good to go for Minidexed?

donluca commented 1 year ago

I've merged #395 into my own branch and can confirm that it works correctly and I'm able to scroll banks beyond 127.

EDIT: I've also opened #456 to discuss the details of a "curated" sysex library for Minidexed, so we can move that conversation there. I'm onto it and I think I can manage it myself, I just need Boban's tool updated so that it can merge single voice sysexes and handle whatever other kind of corruption those sysex files have. If needed, I can also publish an article on my personal blog about it when I'm done to give it more visibility.

diyelectromusic commented 1 year ago

By the way, it looks like we could fairly easily load in the headerless voice banks by checking the file size

Ok, so checking the file size isn't quite as easy as I thought it should be, so instead, if the headered version fails to read in, and the config setting allows it, it will retry with the headerless version.

See #463 and see if you agree with this as an implementation (or would like to give it a go).

Kevin

donluca commented 1 year ago

I thought we all agreed that the most sensible thing would have been to just provide the user a "curated" library instead of doing extra checks which would slow down the initial sysex loading process...?

Despite this, I actually like your approach, especially the possibility to disable the check to have faster loading, so I'd say let's roll with it.

diyelectromusic commented 1 year ago

Yes, this isn't any extra checking as such, it is more allowing for the second most common (from what I see) type of SysEx file. But as it will be slower unless I can easily check the file size, I left it to the option of the user via a configuration setting.

In an ideal work it would validate voice banks somehow, especially those with a header that can be quickly sanity-checked, but that will slow things down a lot I would imagine. So due to this have even less checking the headered files, again that is another reason to leave it as an option.

Kevin

diyelectromusic commented 1 year ago

I think that now headerless files are supported (see #463 ), bank handling has been updated (see #461 ) and that work is now ongoing for a curated list of voice banks (see #456 ) this can probably now be closed off?

Kevin

sinitsinmike commented 6 hours ago

What is the expected syx file name format? 000000 then 3 _ then 3 letters? then .syx?

diyelectromusic commented 5 hours ago

Hopefully it is all explained here: https://github.com/probonopd/MiniDexed/wiki/Files#voice-data-syx-files

Kevin