derselbst / ANMP

multi-channel loopable video game music player for nerds and audiophiles
GNU General Public License v2.0

Cool project :) #68

Open mmontag opened 4 years ago

mmontag commented 4 years ago

You can go ahead and close this issue, just wanted to say it looks like a super cool project. I agree that the file system is a perfectly useable database :) (or at least we can get pretty far with it). I'm using the directory structure as database in my music player, but it runs on the web so I also build an index to make the search fast.

derselbst commented 4 years ago

Thank you very much :)

In fact, I also frequently listen to your chip-tune player. I like its low latency... and esp. the type of music it is capable of playing.

mmontag commented 4 years ago

I'm curious about the approach here to MIDI files and N64. Trying to understand – is the Ultra64 compressed MIDI format completely separate from lazyusf2? Do you need special MIDI files with special SF2 ripped from the games? It seems like an interesting high-level approach.

derselbst commented 4 years ago

lazyUSF

In the very beginning, when people were trying to find a way to retrieve the OST of an N64 game, hackers and geeks said: Hey, let's just take the game itself, strip away everything that is not needed for playing music, and create save-dumps that play back the individual tunes when fed into the "stripped ROM". This is how (mini)USF was born. Today, USF files are available for all publicly released N64 games (except some exotic 64DD ones). But since a USF contains the actual game (a MIPS program), you need an emulator to play it. That is what lazyusf and lazyusf2 provide (the former based on Project64, the latter on mupen64plus).

MIDIs and soundfonts

However, the sound quality of the game's own output is... well... it has room for improvement. Many games render at only 22 kHz. So the same hackers and geeks began to reverse engineer the actual format the music was stored in. It turned out (unsurprisingly) that many games used sequenced (= MIDI-like) audio rather than streamed audio. This means that the quality can be improved if the sequence is synthesized outside the game at a higher sample rate. To get a faithful recreation of the original music, you need the soundbank as well. The one and only tool for that is Subdrag's N64 Sound Tool.

Just use Subdrag's tool to extract the sequences of any N64 game and store them as standard MIDI files. Extracting the soundbank and tweaking its ADSR envelopes to get an exact reproduction of the OST is the difficult part. I've already done it for my favorite Rareware games.

For ANMP, simply put the MIDIs and the SoundFont (sf2 or dls) into a folder and rename the soundfont to the name of the folder it's in; ANMP will then automatically use that soundfont for synthesizing all MIDIs in that folder. If a single MIDI A.mid needs a specific soundfont, simply name that soundfont A.sf2. A.sf2 will then take precedence over the folder's soundfont.
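The lookup rule described here can be sketched roughly as follows. This is only an illustration of the naming convention, not ANMP's actual code: the function name `find_soundfont` is made up, and treating `.dls` the same way as `.sf2` for the per-file override is an assumption.

```python
from pathlib import Path
from typing import Optional

# Extensions mentioned in the comment above; the search order is an assumption.
SOUNDFONT_EXTS = (".sf2", ".dls")


def find_soundfont(midi_path: Path) -> Optional[Path]:
    """Return the soundfont to use for midi_path, or None if none is found.

    Hypothetical helper: a soundfont named after the MIDI file itself
    (A.mid -> A.sf2) wins over one named after the containing folder.
    """
    folder = midi_path.parent
    for stem in (midi_path.stem, folder.name):
        for ext in SOUNDFONT_EXTS:
            candidate = folder / (stem + ext)
            if candidate.is_file():
                return candidate
    return None
```

With a folder `DKR/` containing `DKR.sf2`, `A.sf2`, `A.mid` and `B.mid`, this picks `A.sf2` for `A.mid` and `DKR.sf2` for `B.mid`.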

Under the hood, ANMP uses fluidsynth as its synthesizer. Since the volume response of the SF2 synth model does not match that of the N64 software synth, I'm using the fluidsynth API to manipulate the SF2 default modulators. As of today, this affects all MIDIs, including regular GM MIDI files, which may sound different (or worse) with the N64 volume response curve applied. In addition to the volume response, I've made some custom modifications to how MIDI files are interpreted, see the wiki. Effects like reverb are also hard to reproduce correctly, but I've found a setup that works acceptably, at least for Rareware games.

Ultra64 compressed MIDI format

While the previous approach works quite well, it has a major unfixable limitation: MIDI only supports 16 channels. Banjo-Tooie, however, uses up to 32 channels in a single sequence. This is why I decided to natively support Nintendo's proprietary format for sequenced audio: the same format that they used in the N64 SDK and that they advised developers to use for their games back then.

Fortunately, all of Rareware's games make use of this "Ultra64 Compressed MIDI format" (file extension .cmf). The exception is Banjo-Tooie, where Rareware had to change the header of the CMF. Since there is no way to detect this programmatically, I just used .btmf as the extension for the Banjo-Tooie sequence format.

To obtain this "native" sequenced audio format, simply use Subdrag's tool again, this time exporting as ".bin". Note that there is no guarantee that this BIN file actually contains audio in the CMF format. In fact, many games developed at Nintendo HQ use custom formats (I'll leave it to Subdrag's tool to cope with these formats). Apparently, Nintendo's developers had a lot of time (or fun) re-inventing the wheel.

[... I think I'll put this in the wiki as well...]

mmontag commented 4 years ago

Thanks for this detailed explanation! This all sounds insanely interesting. So you are using fluidsynth as a "high level emulation" generic synth engine. Do you have any links I can check out, like rendered audio files or YouTube videos, demonstrating this kind of .sf2 resynthesis?

As it happens, I'm trying to break into N64/Playstation (PSF-based formats) on my player. That's how I came across your player. These formats are just a little tougher to work with through emscripten since they rely on the file system. But it's sequenced music and a huge catalog, so it should be worthwhile.

derselbst commented 4 years ago

Do you have any links I can check out, like rendered audio files or YouTube videos, demonstrating this kind of .sf2 resynthesis?

No videos. I've rendered some songs for you:

mmontag commented 4 years ago

Thanks for sharing those. They do sound great. I really wish this higher sample rate was possible inside the lazyusf2 player. Here is what kode54 has to say about that. Here is what JoshW has to say

I have lazyusf2 working in my player, but I am pretty bummed about the usflib format. For one thing, there is about 1-2 seconds of blocking code execution at the beginning of each song playback[1] - I assume to decompress samples or something like that (it's compiled to WASM, so I don't get dynamic recompilation or x86 SSE optimizations).

It's also not really possible to seek within the songs - since there's no sequence data, and every intervening instruction has to be processed, it might take 15 seconds to seek 20 seconds forward.

See an example of this in action here: http://www.wothke.ch/webn64/ (J. Wothke compiled a bunch of cool music players to WASM before I did.)

In short, I wish I could improve on lazyusf2 somehow...

[1] (image)

mmontag commented 4 years ago

Been trawling the HCS forums. Found another one of yours: Diddy Kong Racing - Ancient Lake

derselbst commented 4 years ago

Thanks for sharing those. They do sound great.

Thank you, you're welcome.

I really wish this higher sample rate was possible inside the lazyusf2 player.

Yeah, I remember that comment from kode54. I must admit that I still don't fully understand the first approach. The second approach sounds straightforward: just hack the game. Unfortunately, it turns out to be more complicated. I had a look at the N64 SDK's game demo playseq (which does exactly that: playing a sequence of music). It defines the sample rate as a C macro and passes it to the OS function osAiSetFrequency(); the return value of that function is used as the "real" sampling rate for the soft synth. However, they also use that macro for calculating the length of delay lines for the reverb. This is macro logic, the compiler will perform constant folding here. So it's really hard to adjust this properly to match the samplerate... and even if you did, would it sound right or like garbage? And you would need to do it for every single game...

For one thing, there is about 1-2 seconds of blocking code execution at the beginning of each song playback

The only idea I have is to make sure that you have enabled high level emulation (HLE) via usf_set_hle_audio().

It's also not really possible to seek within the songs

Yes, unfortunately. That's one of the main reasons why ANMP decodes the entire song and then seeks within the PCM buffer.
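Once the whole song is decoded, seeking reduces to index arithmetic on the PCM buffer. A minimal sketch of that idea, assuming interleaved integer PCM; the class `DecodedSong` and its fields are hypothetical and not taken from ANMP:

```python
class DecodedSong:
    """Fully decoded, interleaved PCM audio held in memory (illustrative only)."""

    def __init__(self, pcm: bytes, sample_rate: int, channels: int,
                 bytes_per_sample: int = 2):
        self.pcm = pcm
        self.sample_rate = sample_rate
        # One frame holds one sample for every channel.
        self.frame_size = channels * bytes_per_sample
        self.position = 0  # byte offset, always a multiple of frame_size

    def seek(self, seconds: float) -> None:
        """Jump to a time offset; no emulation has to be re-run."""
        total_frames = len(self.pcm) // self.frame_size
        frame = int(seconds * self.sample_rate)
        frame = max(0, min(frame, total_frames))  # clamp to the buffer
        self.position = frame * self.frame_size

    def read(self, frames: int) -> bytes:
        """Return up to `frames` frames starting at the current position."""
        end = min(self.position + frames * self.frame_size, len(self.pcm))
        chunk = self.pcm[self.position:end]
        self.position = end
        return chunk
```

The trade-off is memory and an up-front decode, in exchange for seeks that cost the same regardless of how far they jump.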

Been trawling the HCS forums. Found another one of yours:

Yes, good spot!

mmontag commented 4 years ago

However, they also use that macro for calculating the length of delay lines for the reverb. This is macro logic, the compiler will perform constant folding here. So it's really hard to adjust this properly to match the samplerate.

Yeah, sounds painful.

Do you have MIDI+sf2 for Perfect Dark/Goldeneye 007?

Just came across this and wonder how it was created, because it sounds excellent: https://www.youtube.com/watch?v=K4IBiAUSZwk

derselbst commented 4 years ago

Do you have MIDI+sf2 for Perfect Dark/Goldeneye 007?

No. But MIDI and DLS can be obtained with Subdrag's Sound Tool.

Just came across this and wonder how it was created, because it sounds excellent: https://www.youtube.com/watch?v=K4IBiAUSZwk

I'm not familiar with the Perfect Dark soundtrack (never played it). So I don't know what it sounds like originally and what exactly makes this one "remastered". This particular video uses Opus-compressed audio with a sample rate of 48 kHz. When you download the tune, load it into e.g. sonic-visualiser and look at the spectrogram, you clearly see that the Nyquist frequency is (except for some short peaks due to the compression algorithm) at around 11 kHz.

[spectrogram image: perfectdarkspectro]

In contrast, when you look at the spectrogram of e.g. the DKR Intro I shared earlier, you clearly see that the Nyquist frequency is around 19 kHz. Originally I synthesized it at 48 kHz, but ofc. the sample rate of the individual samples is also limited, and Opus does a good job on compression etc.

So coming back to your question of how it was obtained: most likely via lazyusf, with some postprocessing, e.g. a band-limited sinc interpolation to go up to 48 kHz from the original 22 kHz signal. And perhaps some equalizer to amplify the higher frequencies a bit to obtain the "remastered" sound.
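The reasoning behind this guess is plain Nyquist arithmetic: a signal sampled at rate fs cannot contain content above fs / 2, and upsampling afterwards does not add any. A tiny sketch (the function names are made up for illustration, and the second estimate is only rough, since lossy codecs and band-limited instrument samples can also lower the visible cutoff):

```python
def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency representable at the given sample rate."""
    return sample_rate_hz / 2.0


def implied_source_rate_hz(observed_cutoff_hz: float) -> float:
    """Rough lower bound on the original sample rate, given the frequency
    where a spectrogram goes dark."""
    return 2.0 * observed_cutoff_hz
```

A cutoff near 11 kHz in a 48 kHz file is therefore consistent with audio that was originally rendered at about 22 kHz.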

mmontag commented 4 years ago

Just realized you are the current maintainer of fluidsynth and we had a brief chat there about XG MIDI a while back. 😅

I might try compiling ANMP for macOS. It would be cool to someday incorporate your custom playback stuff into chip player JS, but not possible right now because I'm using fluidlite; it doesn't support DLS.

mmontag commented 2 years ago

@derselbst hi Tom, I had some questions about your MIDI/SF2 process, is there a way I can reach you privately? (Or you can answer here)

MIDI and DLS can be obtained with Subdrag's Sound Tool.

Basic questions about the process:

derselbst commented 2 years ago

is there a way I can reach you privately

Sure, my email address is available from my Github profile, but I prefer public conversation, in case somebody else who is interested comes along.

How does one convert Subdrag DLS output to SF2 - is this a painstaking process?

Subdrag's tool natively supports writing DLS files. It generally does a pretty good job: all the sample loops are accurate and the samples are fine-tuned. However, ADSR, tremolo and vibrato are usually less accurate than in the game. Since my knowledge of DLS is limited and I lack an efficient way to edit DLS, I usually convert the files to SF2 so I can edit them more conveniently with Polyphone. Other than that, there's no reason. And since fluidsynth also supports loading DLS by now, you'd better just try listening to the DLS.

For the conversion itself I used Swami. Whether or not all the information survives the conversion depends on the converter. At least technology-wise, there is nothing that exists only in DLS and isn't supported by SF2 (except that DLS uses 32-bit resolution for certain things where SF2 uses only 16-bit, which I don't care about because it's inaudible).

You mentioned tweaking SF2 ADSR envelopes, but do you have any more pointers or step-by-step process for doing this conversion?

N64 soundbanks specify 16-bit values for attack, release, and maybe sustain and delay, IIRC. What these values mean depends on the game. Subdrag's tool unconditionally multiplies most (all?) of these values by a factor of 3. Why? No clue. But it sounds more accurate than not doing so. I think that statement about "tweaking envelopes" originated from the time when Subdrag's tool didn't support writing DLS files. So I had to manually hack the tool to write SF2, which ofc lacked all ADSR information. So you'd need to go by trial and error: listen to the tune in USF, compare it to the MIDI+SF2 version, figure out which instrument has a strange ADSR, tremolo or vibrato, and then manually mess around in the SF2 instrument until it sounds ok.

This topic actually gets more difficult for certain games, particularly Dinosaur Planet and Star Fox Adventures, because they use certain MIDI CCs like SF2 modulators, presumably to influence ADSR and other aspects. However, I didn't have time to reverse engineer this to figure out how they really work and influence the synthesized audio. (Barely working but probably completely wrong implementation here.) Probably it would be much easier to just ask David Wise, he'll surely remember what he did 20 years ago :)

Ofc, it would be better to completely automate this process. But that would require reverse engineering the games a bit and understanding how they react to different MIDI CCs or envelope values in the sound bank.

mmontag commented 2 years ago

Thanks!! Very useful information.

I have been chipping away at MIDI+SF2 issues using Goldeneye 007 as a basis. So far:

I am trying Viena and Polyphone, my first time seriously editing SoundFonts.

Do you know of any script for unrolling per-track loops in the MIDI files? This seems to me to be the most appropriate way to fix this issue (arguably, should be done during conversion by the Subdrag tool, since it's not part of the MIDI standard) and I don't want to bodge my MIDI player more than I already have.

(Same could also be said for the overlapping notes problem, I suppose?)

BTW, I was using BK.sf2 and DKR.sf2 from your data repo, and they seem to work great. I will be happy to share my SF2 results if I have some success.

Ofc, it would be better to completely automate this process. But that would require reverse engineering the games a bit and understanding how they react to different MIDI CCs or envelope values in the sound bank.

Yes, I believe L.Spiro might be the only person who has gone to this depth in N64 music reversing :)

derselbst commented 2 years ago

Orange notes get cut too early by the green note-offs...not sure how I'm going to solve for that.

The N64 SDK implemented it as a queue: the first noteon spawns a voice, the next noteon spawns another voice, a noteoff releases the first voice, and another noteoff releases the second voice. Jet Force Gemini uses this behavior for "stacking voices", resulting in pretty loud percussion instruments. That's why Subdrag's tool shouldn't mess around with overlapping notes.
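The queue behavior described here can be sketched as a first-in, first-out list of voices. This is not the SDK's code: the class name `VoiceQueue` is made up, and keying the queue by (channel, key) is an assumption.

```python
from collections import defaultdict, deque


class VoiceQueue:
    """FIFO voice allocation: each note-off releases the oldest voice that
    is still sounding for that note (illustrative sketch only)."""

    def __init__(self):
        # (channel, key) -> ids of sounding voices, oldest first
        self._active = defaultdict(deque)
        self._next_id = 0

    def note_on(self, channel: int, key: int) -> int:
        """Always spawn a new voice, even if the same key already sounds."""
        voice_id = self._next_id
        self._next_id += 1
        self._active[(channel, key)].append(voice_id)
        return voice_id

    def note_off(self, channel: int, key: int):
        """Release the oldest voice for this note; None if nothing sounds."""
        queue = self._active.get((channel, key))
        if not queue:
            return None
        return queue.popleft()

    def sounding(self, channel: int, key: int) -> list:
        """Voice ids currently sounding for this note, oldest first."""
        return list(self._active.get((channel, key), ()))
```

Two note-ons followed by one note-off leave the second voice sounding, which is what makes stacked voices possible. A synth that instead treats a repeated note-on as a retrigger would cut the first voice short.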

Do you know of any script for unrolling per-track loops in the MIDI files? This seems to me to be the most appropriate way to fix this issue (arguably, should be done during conversion by the Subdrag tool, since it's not part of the MIDI standard) and I don't want to bodge my MIDI player more than I already have.

There is no such script to my knowledge. Most of these track loops have an infinite loop count. How many iterations would you want to unroll? This highly depends on how long the user wants to listen to the tune, which might not always be predictable.

Also, not all of those loops share the same end tick. David Wise did it that way in DKR, which allows you to synthesize just a few minutes of the tune and then set WAVE loop points to get seamless looping. In DK64 and JFG, however, MIDI track loops of different lengths are used for sound effects (like those crazy laughs in Creepy Castle Ballroom), causing the effects to play at different positions in the audio, depending on how long you have been listening.

So, I don't think it makes sense for Subdrag's tool to unroll those loops. Rather, you have to bodge your MIDI player with that logic. I'm using fluidsynth's sequencer timer callback events to reschedule the MIDI track loop.
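Why unrolling gets awkward with loops of different lengths can be seen from a little tick arithmetic. The sketch below is not ANMP's scheduling code (which, as stated, uses fluidsynth sequencer callbacks); the helper names are hypothetical, and it assumes each track has exactly one infinite loop.

```python
from functools import reduce
from math import gcd


def position_in_loop(tick: int, loop_start: int, loop_end: int) -> int:
    """Map an absolute playback tick to the track-local tick, for a track
    that jumps back to loop_start every time it reaches loop_end."""
    if tick < loop_end:
        return tick
    length = loop_end - loop_start
    return loop_start + (tick - loop_start) % length


def combined_period(loop_lengths) -> int:
    """Ticks after which tracks with these loop lengths line up again
    (least common multiple), counted from a point where all are looping."""
    return reduce(lambda a, b: a * b // gcd(a, b), loop_lengths)
```

If all tracks share one loop length, the whole tune repeats after that length and a single rendered pass can be looped. With lengths such as 96 and 128 ticks, the overall pattern only repeats every 384 ticks, so a fixed unrolled file would have to be much longer, or the player has to reschedule each track's loop itself.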

BTW, I was using BK.sf2 and DKR.sf2 from your data repo, and they seem to work great. I will be happy to share my SF2 results if I have some success.

Cool, yeah that would be great!

Yes, I believe L.Spiro might be the only person who has gone to this depth in N64 music reversing :)

Aha, didn't know about this guy before. I'll try to get in contact with him on Discord. Hoping he'll release his code to the public soon.

Edit: it already is: https://github.com/L-Spiro/Nintendo-Synthy-4

mmontag commented 2 years ago

Well, I gave up and started hacking my MIDI player. Yes, we need to do this in the player if we ever hope to provide looping controls. I'm beginning to appreciate the complexity of your loop handling code; already encountered stuck notes when the loop restarts. Do you know any games that use CC 104 to set the number of loops? And is there such a horror as nested loops?

Thanks for the link to Nintendo-Synthy.

derselbst commented 2 years ago

Do you know any games that use CC 104 to set the number of loops?

I think I've seen that in Jet Force Gemini, sparse4 (the Jungle Main Theme).

And is there such a horror as nested loops?

IIRC, nested loops were documented in the Ultra 64 SDK. However, I can't remember having seen them in any of the games I've come across so far.

mmontag commented 2 years ago

Well, it looks like N64 Tools does attempt to unroll loops, at least in Goldeneye 007 music (brace yourself): CMidiParse::GEMidiToMidi

But it also translates loop start and end to CC 102 and 103. For whatever reason (maybe I got the settings wrong) some of the loops in the MIDI export are unrolled and some aren't. I can't figure out exactly what the looping options do:

[screenshot: CleanShot 2022-10-15 at 04 04 39]

derselbst commented 2 years ago

Well, it looks like N64 Tools does attempt to unroll loops

Oh yeah, I remember. 4 years ago, I asked Subdrag to also export loop markers as CC 102 and 103. I think this loop-unrolling logic already existed back then. So he just unconditionally exported the markers... this could be improved. I never used the other looping options, since I was happy to implement proper loop handling in ANMP :)

mmontag commented 2 years ago

Okay, makes sense. But how do I handle the unrolled loops myself (especially with multiple loop markers)? For example (page 1: https://share.cleanshot.com/di4cnA, page 2: https://share.cleanshot.com/VwKMMS), this track appears to loop forever between tick 3072 and 6144, but it sounds wrong when I do that.

derselbst commented 2 years ago

Uff, where are all those CC103=0 events coming from? Did Subdrag's tool unroll the loops? That doesn't look correct to me, could be a bug. You should try to get a version where no unrolling has taken place. Just CC102 and 103 should be in there. And if it really turns out to be a nested loop, I would expect a CC104 event in there as well. Having two infinite nested loops wouldn't make too much sense I guess.

mmontag commented 2 years ago

https://github.com/jombo23/N64-Tools/issues/49

mmontag commented 1 year ago

For anyone coming across the thread, I had some success getting a conversion process up and running, now with about a dozen N64 games that load up Soundfont banks:

https://mmontag.github.io/chip-player-js/browse/Nintendo%2064%20(SoundFont%20MIDI)

My workflow for each game:

1) extract all MIDI and DLS soundbanks (and debug text) with Subdrag's N64 SoundTool
2) convert DLS to SF2
3) clean up SF2 (manual testing, by ear) with Polyphone
4) grab the corresponding .inl file from L-Spiro's Nintendo-Synthy-4 project (e.g. NS4GoldenEye007Files.inl)
5) run a script that cleans up MIDI files:
   - clean up track loop events based on debug text
   - pick up track titles/soundbank ID from the .inl file
   - for games with multiple soundbanks, insert the Soundfont filename into the MIDI file with a meta text event
   - fix drum channel
   - fix overlapping notes (they are truncated since I've not modified fluidsynth)
   - insert all notes off at end of loops
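The last cleanup step (turning off notes that are still held when a loop ends, so they don't get stuck when playback jumps back) could look roughly like this. It is not the actual cleanup script: the tuple-based event format and the function name are assumptions made for the sketch.

```python
def note_offs_at_loop_end(events, loop_end):
    """Return note-off events for every note still held at loop_end.

    events: (tick, kind, channel, key) tuples sorted by tick, where kind
    is "on" or "off". This layout is hypothetical, not a real MIDI API.
    """
    held = {}  # (channel, key) -> number of note-ons not yet released
    for tick, kind, channel, key in events:
        if tick >= loop_end:
            break  # events at or after the loop end are outside the loop
        note = (channel, key)
        if kind == "on":
            held[note] = held.get(note, 0) + 1
        elif kind == "off" and held.get(note, 0) > 0:
            held[note] -= 1
    # One note-off per note-on that is still pending, placed at the loop end.
    return [
        (loop_end, "off", channel, key)
        for (channel, key), count in sorted(held.items())
        for _ in range(count)
    ]
```

Counting held notes per key, rather than storing a single flag, keeps the result correct when the same key has overlapping note-ons.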

Essentially, it is a light version of what ANMP does. Maybe not as accurate, but same idea with slightly different goals. Hope you enjoy.

mmontag commented 1 year ago

I found something interesting. Way back in 2000, there was a spec for embedding DLS soundbanks with MIDI files in a RIFF wrapper. http://web.archive.org/web/20110610135604/http://www.midi.org/about-midi/rp29spec(rmid).pdf
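For anyone who wants to peek inside such a file: RIFF is a simple container of tagged, length-prefixed chunks, so listing the contents takes only a few lines. The sketch below is generic RIFF parsing; that an RMID file uses the form type `RMID`, keeps the Standard MIDI File in a `data` chunk, and embeds the DLS bank as a nested `RIFF` chunk is stated from memory of the spec and should be verified against it. The function name `parse_riff` is made up.

```python
import struct


def parse_riff(data: bytes):
    """Return (form_type, [(chunk_id, payload), ...]) for a RIFF file.

    Only the top-level chunks are listed; nested RIFF/LIST chunks are
    returned as raw payloads.
    """
    if len(data) < 12 or data[:4] != b"RIFF":
        raise ValueError("not a RIFF file")
    (riff_size,) = struct.unpack_from("<I", data, 4)  # little-endian size
    form_type = data[8:12]
    end = min(8 + riff_size, len(data))
    chunks = []
    pos = 12
    while pos + 8 <= end:
        chunk_id = data[pos:pos + 4]
        (size,) = struct.unpack_from("<I", data, pos + 4)
        chunks.append((chunk_id, data[pos + 8:pos + 8 + size]))
        pos += 8 + size + (size & 1)  # chunks are padded to an even length
    return form_type, chunks
```

Calling `parse_riff(open(path, "rb").read())` on one of the archived RMI files should then show which chunks are present and how large the embedded soundbank is.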

And here is a collection - literally just 15 RMI files - on archive.org: https://archive.org/details/RIFF-MIDI-DLS I doubt there are many more than this in existence.

I don't like the spec because there is no facility for pointing to an external shared soundbank, but interesting nonetheless.