Nyagamon / HCADecoder

HCA Decoder
MIT License

Loop start/end position may be wrong (slightly off) #4

Open segfault-bilibili opened 2 years ago

segfault-bilibili commented 2 years ago

Currently it seems to be:

  1. loop start sample offset = loop start block index * 0x400
  2. loop end sample offset = loop end block index * 0x400

The loop header section seems to be interpreted as:

  1. loop start block index: big endian unsigned 32-bit integer
  2. loop end block index: big endian unsigned 32-bit integer
  3. loop cycle count (a value of 128 means infinite): big endian unsigned 16-bit integer
  4. loop r01 (sorry, but I can't understand what "r01" means): big endian unsigned 16-bit integer

However, according to VGAudio:

https://github.com/Thealexbarney/VGAudio/blob/9d8f6ea04c83cccccb3dd7851a631bbd53a8dbbe/src/VGAudio/Codecs/CriHca/HcaInfo.cs#L35

        public int LoopStartSample => LoopStartFrame * 1024 + PreLoopSamples - InsertedSamples;
        public int LoopEndSample => (LoopEndFrame + 1) * 1024 - PostLoopSamples - InsertedSamples;

https://github.com/Thealexbarney/VGAudio/blob/9d8f6ea04c83cccccb3dd7851a631bbd53a8dbbe/src/VGAudio/Containers/Hca/HcaReader.cs#L189

        private static void ReadLoopChunk(BinaryReader reader, HcaStructure structure)
        {
            structure.Hca.Looping = true;
            structure.Hca.LoopStartFrame = reader.ReadInt32();
            structure.Hca.LoopEndFrame = reader.ReadInt32();
            structure.Hca.PreLoopSamples = reader.ReadInt16();
            structure.Hca.PostLoopSamples = reader.ReadInt16();
            structure.Hca.SampleCount = Math.Min(structure.Hca.SampleCount, structure.Hca.LoopEndSample);
        }

segfault-bilibili commented 2 years ago

(Sorry for mistyping something above. Now it should be corrected)

I'm not very sure which interpretation of the loop header is correct. Or maybe both make some sense?

segfault-bilibili commented 2 years ago

@Thealexbarney

Thealexbarney commented 2 years ago

Well, think about it a little. If your first two points were true, then no HCA file could have a loop point that's not a multiple of 0x400. That would be extremely restrictive and would result in funky loop points, since they could only fall on multiples of 0x400.

The structure of the loop block is

int LoopStartFrame;
int LoopEndFrame;
short PreLoopSamples;
short PostLoopSamples;

The pre-loop samples are the number of samples in the loop start frame that come before the loop point. The post-loop samples are the number of samples in the loop end frame that come after the loop point.
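
For illustration, here is a minimal Python sketch of the layout and formulas discussed above. It assumes the 12 bytes following the chunk signature are big-endian and signed, as stated elsewhere in this thread. The names `LoopChunk`, `parse_loop_payload` and `loop_points` are made up for this example, and `inserted_samples` is passed in by hand because the loop chunk shown here does not carry it.

```python
import struct
from typing import NamedTuple, Tuple

SAMPLES_PER_FRAME = 1024  # 0x400, i.e. 8 subframes of 128 samples


class LoopChunk(NamedTuple):
    loop_start_frame: int
    loop_end_frame: int
    pre_loop_samples: int
    post_loop_samples: int


def parse_loop_payload(payload: bytes) -> LoopChunk:
    """Unpack the 12 bytes after the chunk signature (assumed layout)."""
    # ">iihh": big-endian, two signed 32-bit ints, then two signed 16-bit ints.
    return LoopChunk(*struct.unpack(">iihh", payload[:12]))


def loop_points(chunk: LoopChunk, inserted_samples: int) -> Tuple[int, int]:
    """Loop start/end in samples, following the two VGAudio formulas quoted above."""
    start = (chunk.loop_start_frame * SAMPLES_PER_FRAME
             + chunk.pre_loop_samples - inserted_samples)
    end = ((chunk.loop_end_frame + 1) * SAMPLES_PER_FRAME
           - chunk.post_loop_samples - inserted_samples)
    return start, end


# Hypothetical values, not taken from a real file.
payload = struct.pack(">iihh", 2, 9, 0x80, 0)
chunk = parse_loop_payload(payload)
print(loop_points(chunk, inserted_samples=0x80))  # (2048, 10112)
```

With these made-up values the loop start is 2 * 1024 + 128 - 128 = 2048, so the frame index alone would have given the same answer only because the pre-loop samples and the inserted samples happen to cancel.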

Or maybe both make some sense?

Nah, that decoder's interpretation of those values in the structure is completely wrong.

My decoder/encoder should be completely correct and has been thoroughly tested against CRI's decoders/encoders that are available.

segfault-bilibili commented 2 years ago

If your first two points are true

Well, actually it's not "my" point - I didn't know HCA at all until I came across https://github.com/y2361547758/hca.js, which is a TypeScript port of this project (https://github.com/Nyagamon/HCADecoder). Even now I still have no idea how the stock/official HCA decoder works (figuring that out would require reverse engineering).

no HCA file could have a loop point that's not a multiple of 0x400. That would be extremely restrictive and would result in funky loop points, since they could only fall on multiples of 0x400.

That's exactly what I had thought. However, I couldn't imagine where more precise loop start/end positions could be stored until I came across your project (https://github.com/Thealexbarney/VGAudio).

I think Nyagamon's interpretation of the loop header may make some sense, because every HCA file I have examined that loops infinitely in game (admittedly only a few) seems to have loop.PreLoopSamples == 0x0080. So I guessed that 0x0080 might mean "loop count is infinite".

segfault-bilibili commented 2 years ago

By the way, are these signed or unsigned integers? It doesn't seem to make sense to use negative values here.

Thealexbarney commented 2 years ago

Even now I still have no idea how the stock/official HCA decoder works (figuring that out would require reverse engineering).

I've reverse engineered the HCA encoder/decoder. The implementation in VGAudio is functionally the same as the official one, producing exactly the same output for both encoding and decoding, so it makes a good reference.

(Note: I replaced the IMDCT implementation CRI uses with a faster one. The only difference in the output from the current master VGAudio build will be due to tiny rounding differences. When using the IMDCT implementation CRI uses, the outputs are identical.)

I think Nyagamon's interpretation of the loop header may make some sense, because every HCA file I have examined that loops infinitely in game (admittedly only a few) seems to have loop.PreLoopSamples == 0x0080. So I guessed that 0x0080 might mean "loop count is infinite".

No, that's because of how the encoder works. The encoder inserts a subframe of audio at the start because decoding a subframe requires some of the data from the previous subframe. Then the encoder adds enough samples to align the loop start to the beginning of a frame so the minimum amount of processing is needed to seek to the loop point.

This results in the loop start being one subframe past the start of a frame since the decoder needs data from the previous subframe to decode the next one.

BTW, be sure to account for InsertedSamples and AppendedSamples in all of these calculations. These are empty samples added to the beginning and end of the actual audio because encoding to HCA requires the number of samples to be a multiple of the frame size.
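
As a rough illustration of the padding described above, the following Python sketch derives an inserted/appended sample count from a loop start position. It is only one reading of this comment, not VGAudio's or CRI's actual encoder logic: it assumes exactly one priming subframe, pads the front so the loop start lands one subframe past a frame boundary, and pads the end up to a whole frame. `encoder_padding` and `trim_padding` are hypothetical helper names.

```python
SUBFRAME = 128  # samples per subframe
FRAME = 1024    # samples per frame (8 subframes)


def encoder_padding(sample_count, loop_start):
    """Return (inserted, appended) padding under the assumptions stated above."""
    # One priming subframe, plus enough samples to push the loop start
    # to exactly one subframe past the start of a frame.
    inserted = SUBFRAME + (-loop_start) % FRAME
    # Pad the tail so the encoded length is a whole number of frames.
    appended = (-(inserted + sample_count)) % FRAME
    return inserted, appended


def trim_padding(decoded, inserted, sample_count):
    """Drop the inserted and appended samples from decoded output."""
    return decoded[inserted:inserted + sample_count]


# Hypothetical input: 10000 samples of audio with a loop starting at sample 3000.
inserted, appended = encoder_padding(sample_count=10000, loop_start=3000)
print(inserted, appended)             # 200 40
print((inserted + 3000) % FRAME)      # 128, i.e. PreLoopSamples == 0x80
```

Under these assumptions the pre-loop sample count always comes out as 128 (0x0080) for a looping file, which matches the value observed in the files mentioned earlier in the thread.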

Thealexbarney commented 2 years ago

By the way, are these signed or unsigned integers? It doesn't seem to make sense to use negative values here.

They're signed, but it doesn't really matter since they won't get anywhere near the limit of the signed types.

segfault-bilibili commented 2 years ago

Thank you very much!

decoding a subframe requires some of the data from the previous subframe.

  1. Is a subframe 128 samples long (and does a frame consist of 8 subframes)? I once observed this in a hex editor, but I'm still not sure how far the "influence" of one subframe extends.

  2. Is any following data also needed to decode one frame? I once heard that the (I)MDCT refers to both preceding and following data.


To be honest, I know almost nothing about signal processing. Sorry if my newbie questions have taken up your time. These two questions should be the last ones I need to ask.

Again, thanks a lot!

Thealexbarney commented 2 years ago

  1. Is a subframe 128 samples long (and does a frame consist of 8 subframes)?

Yes. This is true for all HCA files.
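
To make the sizes concrete, here is a small Python sketch that splits a sample position in the encoded stream (that is, counting any inserted samples) into frame, subframe and offset. The constant and function names are chosen for this example only.

```python
SUBFRAME_SAMPLES_BITS = 7
SUBFRAMES_PER_FRAME = 8
SUBFRAME_SAMPLES = 1 << SUBFRAME_SAMPLES_BITS            # 128
FRAME_SAMPLES = SUBFRAMES_PER_FRAME * SUBFRAME_SAMPLES   # 1024 == 0x400


def locate(sample_index):
    """Return (frame, subframe within frame, offset within subframe)."""
    frame, within_frame = divmod(sample_index, FRAME_SAMPLES)
    subframe, offset = divmod(within_frame, SUBFRAME_SAMPLES)
    return frame, subframe, offset


print(locate(2176))  # (2, 1, 0): one subframe past the start of frame 2
```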

Side thought: Oops, I just noticed that naming inconsistency

public const int SubframesPerFrame = 8;
public const int SubFrameSamplesBits = 7;

  2. Is any following data also needed to decode one frame? I once heard that the (I)MDCT refers to both preceding and following data.

Oversimplifying just enough to answer your question: decoding subframe (SF) N requires only the encoded data from SF N-1 and SF N. It doesn't need data from any other SF, so neither SF N-2 nor SF N+1 is involved.

For example, decoding the audio in subframe 4 requires only the encoded data from both SF 3 and SF 4. It doesn't need any data from SF 2, SF 5 or any other SF.

This is why an extra subframe is inserted at the beginning during encoding and thrown out as garbage when decoding.
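
The dependency rule above can be written down as a tiny Python helper. This is only a sketch of the simplified rule as stated here (subframe N needs the encoded data of N-1 and N), with hypothetical function names; it does not model the decoder itself.

```python
def subframes_to_read(first, last):
    """Encoded subframes needed to reconstruct output subframes first..last."""
    if first < 0 or last < first:
        raise ValueError("expected 0 <= first <= last")
    # Each subframe also needs its predecessor; subframe 0 has none.
    return range(max(first - 1, 0), last + 1)


def has_predecessor(subframe):
    """False for subframe 0, whose decoded output is therefore discarded."""
    return subframe >= 1


print(list(subframes_to_read(4, 4)))  # [3, 4]
print(list(subframes_to_read(0, 0)))  # [0]
print(has_predecessor(0))             # False
```

Subframe 0 is the only one without a predecessor, which is the case the inserted subframe exists to absorb.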