Open spessasus opened 1 month ago
You have some great ideas.
Thanks, @stgiga ! There are a few things I thought of but missed when writing above though:
Maybe even somehow program in sysEx messages? (like if manufacturer ID is this and nth byte is this and that, then take byte X as the value). It is probably a bit too complex though (or is it? ;-)
more transforms:
changing type
Sample interpolation types. I can think of 4 now:
Loop types, it would be a flag that goes after loopCount
: CHRL - chorus list chunk
chorusType and it would be the index in the CHRL chunk)
Funnily enough, having those extra fields can improve Roland SC-88Pro and SC-8850/8820/SC-D70 Insertion EFX (CC94), as well as open up whatever fancy stuff MU2000EX XG3 does. So I'm OK with implementing this, and I've gotten the other team member involved.
Well, that's what I was thinking of, specifically the 88Pro. I actually started my MIDI and soundfont adventure by discovering MIDIs made for 88Pro. So recreating these perfectly would be amazing!
My first major SoundFont was an 88Pro SoundFont, and these days Tyroland (and some others, but on back burner because college) has good SC-8850 support created from the Tyros4+JV1010 samples in a long process.
The project lead got into SC-88x support because of my efforts (which have roots as far back as 2015), because I hated seeing broken songs. So I think even the project lead would be interested.
@spessasus Thank you for the draft for the SFe rework. Sorry for taking so long to write my response.
Naturally, because you came up with a complete re-structuring of the SFe specification, I've got a lot of feedback to give you.
You do have some good ideas, and I'll be working on version 4.00.6 of the specification, which splits the versions of SFe apart. Please check the branches of the repository for a preview. If you want to be part of the SFe team, then please tell me!
Forgot to add this. The many features that you mention that are incompatible with SF2.04 such as TSC, DWORDs and RIFF64, are mostly absent in the SFe32 specification.
Please take a look at the SFe32-only branch of the repository for the compatible version of the SFe specification.
Also, do you have any suggestions about where we can place the TSC implementation info, so we can make it clear that using TSC will make your file incompatible with some players? Or should we scrap TSC altogether?
We'll probably remove TSC implementation info temporarily for 4.00.6. We'll also plan to bring it back at a future date, maybe as part of SFe64L.
I DO NOT want TSC scrapped.
What's the point of TSC?
Just use 64-bit riff chunks like I suggested and then you have practically no size constraints.
Since TSC is already fundamentally incompatible with sf2, why not just skip to 64?
I mean, it was to make 32bit at least not be a complete rewrite for more samples.
Alright, there seems to be some disagreement about the TSC mode. Let me try to understand what's going on:
Right now, I'm going to propose this compromise:
Stuff to do:
Alright, let's sum my thoughts up:
TSC is putting SDTA at the end of the file. This is fundamentally incompatible with SF2 and therefore with all Sound Blasters and softsynths. So, a rewrite. Not to mention that there's no point in TSC with SB as they can load up to 30MB (or something like that) soundfonts. So the only advantage it brings (larger SDTA limit) is nullified.
RIFF64 is making some chunks use 64-bit length rather than 32-bit and changing the last 2 letters of the fourCC to 64 (in ASCII). This is also fundamentally incompatible, so a rewrite.
Now, here's what a "rewrite" means for SpessaSynth for both of these:
RIFF64: checking whether the last two characters of the chunk's fourCC are 64 and then reading the length as 8 bytes instead of 4. This is a very minor change that I can implement in 2 lines of code. In fact, here is the new code:
export function readRIFFChunk(dataArray, readData = true, forceShift = false)
{
    // the chunk header is a 4-character ASCII code (fourCC)
    let header = readBytesAsString(dataArray, 4);
    let sizeByteLength = header.substring(2) === "64" ? 8 : 4; // this line is new
    let size = readLittleEndian(dataArray, sizeByteLength); // this line is changed
    let chunkData = undefined;
    if (readData)
    {
        chunkData = new IndexedByteArray(dataArray.buffer.slice(dataArray.currentIndex, dataArray.currentIndex + size));
    }
    if (readData || forceShift)
    {
        dataArray.currentIndex += size;
    }
    // odd-sized chunks may be followed by a zero pad byte: skip it if present
    if (size % 2 !== 0)
    {
        if (dataArray[dataArray.currentIndex] === 0)
        {
            dataArray.currentIndex++;
        }
    }
    return new RiffChunk(header, size, chunkData);
}
As you can see, TSC is very much a major rewrite, unlike the 64-bit extension (at least in SpessaSynth's case). And not to mention, come on. All modern computers are 64-bit. The only major exception would be the hardware soundblasters which don't take advantage of TSC anyways.
And even then, the 64-bit mode could work for 32 bit machines. The spec could add the following:
For 32-bit software, 64-bit chunks have to be checked. If any of the top four bytes is not set to 0, the chunk size is above the 32-bit maximum and the file should be rejected as incompatible with the architecture. If the top four bytes are all set to zero, the chunk must be read and played normally.
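As a rough illustration of that rule, here is a minimal sketch. The helper name readChunkSize64 is made up for this example (it is not part of SpessaSynth or any spec), and it assumes the 8-byte size is stored little-endian, as in regular RIFF.

```javascript
// Hypothetical helper (not part of SpessaSynth): reads an 8-byte
// little-endian chunk size and applies the proposed 32-bit rule.
function readChunkSize64(bytes, offset = 0) {
    const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    const low = view.getUint32(offset, true);      // bottom four bytes
    const high = view.getUint32(offset + 4, true); // top four bytes
    if (high !== 0) {
        // some of the top four bytes are set: size is above the 32-bit maximum
        throw new RangeError("chunk size exceeds the 32-bit maximum, file rejected");
    }
    return low;
}

// a 16-byte chunk stored with a 64-bit size field
const exampleSize = readChunkSize64(new Uint8Array([0x10, 0, 0, 0, 0, 0, 0, 0]));
console.log(exampleSize);
```

A reader built this way never has to hold a number wider than 32 bits, which is the whole point of the rule.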
So, you've shown that it is a lot easier for a software developer to adapt to 64-bit than to use TSC mode.
Therefore, we're going to be moving TSC mode to its own specification separate from SFe. It will require the player to support SFe, but SFe won't require it. Version 1.0 of this specification should be released alongside of SFe 4.00 or 4.01.
To facilitate this method of developing features, we are working on giving the ISFe chunk its first use, as a "feature flag" system. The specific flag used for TSC will be well-defined, but the information on TSC will be absent from SFe specifications until we've sorted out where TSC mode should go.
So, you've shown that it is a lot easier for a software developer to adapt to 64-bit than to use TSC mode.
Therefore, we're going to be moving TSC mode to its own specification separate from SFe. It will require the player to support SFe, but SFe won't require it. Version 1.0 of this specification should be released alongside of SFe 4.00 or 4.01.
To facilitate this method of developing features, we are working on giving the ISFe chunk its first use, as a "feature flag" system. The specific flag used for TSC will be well-defined, but the information on TSC will be absent from SFe specifications until we've sorted out where TSC mode should go.
For now, TSC mode has been moved to a separate specification independent of SFe. However, this may change in the future. Sorry for posting the same reply twice!
Since SFe isn't really backwards compatible (see RIFF64, TSC or changing words to dwords), I propose to revise the file structure to remove all limits. It is heavily inspired by the DLS file structure.
My proposal
TOC
This proposal is intended to create a proper SoundFont v3 file format that can be (hopefully) infinitely expanded.
ALL UNKNOWN CHUNKS HAVE TO BE IGNORED AND PRESERVED AS READ
file extension
.sft - short for "SoundFonT"
Definitions
The INFO chunk
There's a defined INFO chunk that can be added to almost everything. It is described below.
All chunks here are 32 bits (no need for more than 4GB of text, come on)
All text chunks must use the utf-8 encoding!
All chunks within the INFO chunk are strings unless explicitly stated otherwise.
LIST:INFO
- INAM - Required. Name of the object that this list is in.
- ICOP - Optional. Copyright of the object.
- ICMT - Optional. Comment/description of the object.
- IENG - Optional. Creator of this object.
- ICRD - Optional. Date formatted as ISO-8601 string to ensure that software can read it as a date regardless of locale.
- ISFT - Optional. The software used to create or edit this object.
- LIST:PROP multiple - Optional. A custom property defining something (like "MIDI system": "Yamaha XG"). There can be multiple of these chunks in the info list.
  - PNAM - Property name as a string.
  - PVAL - Property value as a string.
This chunk is denoted as INFOLIST in the format below, optionally listing additional chunks for this specific field.
Curve
SFZ defines the concept of curves. Let's add them here!
A curve is defined as a number of points. The first one has to be 0, the last one has to be 1. The points are evenly spaced apart and interpolated LINEARLY.
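To make the interpolation concrete, here is a small sketch of how a player might evaluate such a curve. The function name evaluateCurve is hypothetical, and it assumes that the input runs from 0 to 1 along the evenly spaced points and that the stored values are the curve's outputs. Neither assumption is settled by the proposal.

```javascript
// Hypothetical evaluator: `points` are the curve values, assumed to be
// evenly spaced over an input range of 0..1.
function evaluateCurve(points, x) {
    if (points.length < 2) {
        throw new Error("a curve needs at least two points");
    }
    const clamped = Math.min(1, Math.max(0, x));
    // position of x on the point grid, e.g. 1.5 = halfway between #2 and #3
    const position = clamped * (points.length - 1);
    const index = Math.min(Math.floor(position), points.length - 2);
    const fraction = position - index;
    // linear interpolation between the two neighbouring points
    return points[index] + (points[index + 1] - points[index]) * fraction;
}

console.log(evaluateCurve([0, 1], 0.25));       // basic linear curve
console.log(evaluateCurve([0, 0.25, 1], 0.75)); // three-point curve
```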
Curve chunk
Here's a curve chunk
LI64:CURV - a 64-bit RIFF chunk
- PT64 - point data, a 64-bit RIFF chunk
  - pointsAmount - an unsigned 32-bit integer describing the number of points in the curve.
  - [points] - a sequence of FLOAT numbers one after another, describing the points. [points] must take exactly pointsAmount * 4 bytes.
- INFOLIST - the optional INFO list to describe this curve (for example, by naming it GS Volume Curve)
For example, for a basic linear curve, you would do
#1 = 0.0
#2 = 1.0
Modulator Index list
A simple list describing the used modulators. Used in presets, instruments, and the default mod list.
MODS
- modsAmount - an unsigned 32-bit integer describing the number of modulators in the list. The values below are NOT RIFF chunks, just numbers one after another
- [indexes] - a sequence of 32-bit unsigned integers one after another, pointing to an index within the MODL list in the main INFO chunk. [indexes] must take exactly modsAmount * 4 bytes.
Sample
Here's a chunk that describes a single sample.
LIST:SAMP - a single sample.
- INFOLIST - the INFO list describing the sample.
- MKEY - the original MIDI key number of the sample. A 16-bit unsigned number.
- GAIN - the gain adjustment to the sample. A FLOAT number.
- TUNE - the tuning for the sample, expressed in cents. A 32-bit signed integer.
- SDIX - sample index. This chunk is 16 bytes long. The values below are NOT RIFF chunks, just numbers one after another
  - indexStart - The first eight bytes form the 64-bit unsigned start index in bytes starting from the first byte of the sample data chunk.
  - indexEnd - The last eight bytes form the 64-bit unsigned number of the sample end index (the last byte of the byte-stream) starting from the first byte of the sample data chunk.
- TYPE - a 16-bit unsigned integer describing the audio type. See audio types
- LIST:LOOP - the loops for this sample. OVERLAPPING LOOPS ARE NOT ALLOWED!
  - LOOP multiple - a single loop. The values below are NOT RIFF chunks, just numbers one after another
    - loopCount - a signed 16-bit integer describing the loop count of this loop. 0 means infinite loops and -1 means "exit loop on key release"
    - loopStart - an unsigned 64-bit integer describing the loop start index IN SAMPLES relative to first sample, inclusive.
    - loopEnd - an unsigned 64-bit integer describing the loop end index IN SAMPLES relative to first sample, exclusive.
Audio types
The TYPE chunk within a sample chunk describes the type of bitstream the sample uses.
Currently, I can think of these types:
More can be added later (up to 32k!)
Up to two channels of audio are allowed. They shall be interpreted as effectively two samples with exactly the same generators, except for pan which is forced to -1 on the left sample and 1 on the right.
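The SDIX and LOOP layouts from the sample chunk could be read roughly like this. It is only a sketch: parseSDIX, parseLoop and hasOverlappingLoops are names invented for this example, and little-endian byte order is assumed because that is what RIFF normally uses. The 64-bit fields are kept as BigInt values so nothing is lost above 2^53.

```javascript
// Reads the 16-byte SDIX payload: two unsigned 64-bit byte offsets.
function parseSDIX(view) {
    return {
        indexStart: view.getBigUint64(0, true),
        indexEnd: view.getBigUint64(8, true)
    };
}

// Reads one 18-byte LOOP payload:
// int16 loopCount, uint64 loopStart (inclusive), uint64 loopEnd (exclusive).
function parseLoop(view, offset = 0) {
    return {
        loopCount: view.getInt16(offset, true),
        loopStart: view.getBigUint64(offset + 2, true),
        loopEnd: view.getBigUint64(offset + 10, true)
    };
}

// Loops are [loopStart, loopEnd) ranges, so after sorting by start,
// two loops overlap when one starts before the previous one ends.
function hasOverlappingLoops(loops) {
    const sorted = [...loops].sort((a, b) =>
        a.loopStart < b.loopStart ? -1 : a.loopStart > b.loopStart ? 1 : 0);
    for (let i = 1; i < sorted.length; i++) {
        if (sorted[i].loopStart < sorted[i - 1].loopEnd) {
            return true;
        }
    }
    return false;
}
```

A loader could call hasOverlappingLoops once per sample and reject the file (or drop the offending loop) when it returns true. Which of the two is the right reaction is not decided here.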
Generator list
Here's a generator list.
GENL - generator list. The values below are NOT RIFF chunks, just numbers one after another
- generatorsAmount - a 32-bit unsigned number of generators in this chunk.
- [generators] - a bit stream of generators. A generator is defined as a 6-byte stream.
  - type - a signed 16-bit generator type. The generator types are to be defined
  - value - a FLOAT number with the generator value. Four bytes.
Another thing, since the value is a float now, I propose changing timecents to seconds (or milliseconds?). It would make things A LOT easier. Also making pan be -1 to 1.
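Here is a sketch of how the GENL payload could be parsed, plus the conversion a converter would need if timecents became seconds. parseGeneratorList and timecentsToSeconds are hypothetical names. The sketch assumes little-endian values and that FLOAT means a 32-bit IEEE 754 float (the "four bytes" above). The timecents formula is the regular SF2 one: seconds = 2^(timecents / 1200).

```javascript
// Parses a GENL payload: uint32 generatorsAmount followed by
// generatorsAmount records of 6 bytes (int16 type + float32 value).
function parseGeneratorList(bytes) {
    const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    const generatorsAmount = view.getUint32(0, true);
    if (bytes.byteLength < 4 + generatorsAmount * 6) {
        throw new RangeError("GENL chunk is shorter than generatorsAmount implies");
    }
    const generators = [];
    for (let i = 0; i < generatorsAmount; i++) {
        const offset = 4 + i * 6;
        generators.push({
            type: view.getInt16(offset, true),
            value: view.getFloat32(offset + 2, true)
        });
    }
    return generators;
}

// SF2 timecents to seconds, for converting existing SF2 generator values.
function timecentsToSeconds(timecents) {
    return Math.pow(2, timecents / 1200);
}
```

With float values, an SF2 envelope time of -1200 timecents would simply be stored as 0.5 (seconds), which is a lot easier to read in an editor.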
Modulator
The modulator is something I'm unsure about, but it has to implement the following:
Instrument
Here's a chunk that describes a single instrument
LIST:INST - the instrument
- INFOLIST - the info list describing this instrument.
- FLAG - a two-byte long chunk, forming a 16-bit unsigned flag number. Currently undefined.
- LIST:ZONE - the zone LIST.
  - KEYR - the MIDI key range as a 16-bit number. The first byte is the minimum MIDI key, and the second byte is the maximum. BOTH INCLUSIVE.
  - GENL - the generator list
  - MODS - the modulator index list
  - INFOLIST - optional - the info about this zone.
  - SIDX - optional - sample index within the sample list. A 32-bit unsigned number. The lack of this chunk makes this a global zone.
Preset
Here's a chunk that describes a single preset
LIST:PRES - the preset
- INFOLIST - the info list describing this preset.
- BANK - a four-byte long chunk. The first two bytes form a 16-bit unsigned bank MSB number and the second two form a 16-bit unsigned bank LSB number.
- FLAG - a two-byte long chunk, forming a 16-bit unsigned flag number. Zero means nothing, 1 means a "drum preset." Other flags can be added later.
- LIST:ZONE - the zone LIST.
  - KEYR - the MIDI key range as a 16-bit number. The first byte is the minimum MIDI key, and the second byte is the maximum. BOTH INCLUSIVE.
  - GENL - the generator list
  - MODS - the modulator index list
  - INFOLIST - optional - the info about this zone.
  - IIDX - optional - instrument index within the instrument list. A 32-bit unsigned number. The lack of this chunk makes this a global zone.
File format
RF64:snft - main RIFF chunk (64 bits)
- INFOLIST - The INFO LIST. This one has the additional chunks below:
  - LIST:MODL - The global modulator list for the SoundFont. All modulators are described here
    - SMOD multiple - A single SoundFont Modulator. The structure is to be decided, but I'm thinking of the DLS articulator structure. Also with an optional INFO list.
  - LI64:CRVS - the defined curves list.
    - LI64:CURV - a single curve. Defined in Curve Chunk
  - DMOD - default modulators, applied to all instruments.
    - MODS - the Modulator index list.
- LI64:SFDT - the soundfont data (64 bits)
  - LI64:PRSL - preset list (64 bits)
    - LIST:PRES multiple - a single preset
  - LI64:ISTL - instrument list (64 bits)
    - LIST:INST multiple - a single instrument
  - LI64:SMPL - the sample list (64 bits)
    - LIST:SAMP multiple - a single sample.
  - SD64 - the SAMPLE DATA
Sample data
The SD64 chunk contains all the sample data. All the data is a bunch of bitstreams one after another.
For example, if the first sample was WAVE and the second one was FLAC, the bitstream would start with the WAVE file and just after that there would be the FLAC file pasted in.
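Pulling one sample's bitstream back out of SD64 would then be a plain slice using the sample's indexStart and indexEnd. A minimal sketch follows. sliceSampleData is a made-up name, and it assumes indexEnd points at the last byte of the stream (inclusive), which is how the SDIX description reads. The real rule would have to be pinned down in the spec.

```javascript
// Hypothetical helper: returns the bytes of one sample from the SD64 data.
// indexStart / indexEnd are the SDIX values (BigInt or Number),
// with indexEnd assumed to be INCLUSIVE.
function sliceSampleData(sd64Data, indexStart, indexEnd) {
    const start = Number(indexStart);
    const end = Number(indexEnd);
    if (!Number.isSafeInteger(start) || !Number.isSafeInteger(end)) {
        throw new RangeError("sample index is too large for this reader");
    }
    if (start > end || end >= sd64Data.length) {
        throw new RangeError("sample index points outside of the SD64 chunk");
    }
    // subarray does not copy, it only creates a view on the same buffer
    return sd64Data.subarray(start, end + 1);
}

// two bitstreams stored back to back: bytes 0..2 and bytes 3..5
const exampleData = new Uint8Array([1, 2, 3, 4, 5, 6]);
console.log(sliceSampleData(exampleData, 0n, 2n));
console.log(sliceSampleData(exampleData, 3n, 5n));
```

The slice can then be handed to whatever decoder matches the sample's TYPE value.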
Conclusion
What do you think of this format? Is it any good?? Let me know!