Open TonalidadeHidrica opened 3 years ago
Do you have any real world data that contains OggS
? Any real world use case?
Generally the point of a magic is that it's, well, magic and the sequence that allows you to identify the start of a stream.
Is your proposal to silently ignore hash mismatches and continue the search? What if there is a legitimate hash mismatch then? I think it'd be better to error in that instance than to silently omit a packet and mess up higher layers :).
One of the most simple example is when the comment header contains a text "OggS". I agree that finding the pattern in audio packets is rare, probably almost never happen tho.
My proposal in my brain is to silently skip it. If the skipped packet was within a valid packet position but the page simply contains an error, then we can detect it for the next time we captured a valid packet, since we will see a jump in "page sequence no".
My concern is about seeking. When parsing pages sequentially, maybe we don't have to care about it so much. But when capturing a page right after seeking, because of the misdetection we may lose multiple pages covered by the first "false page". This may cause a problem when binary searching, and that's why I am worrying about it. Do you have any idea to handle them both? Or am I thinking too much?
One of the most simple example is when the comment header contains a text "OggS".
Hmm that's a good point. During normal operation it's not an issue I guess, mainly during seeking.
Do you have any idea to handle them both? Or am I thinking too much?
Nono, I think you are onto a good solution here :). It would be great if for the search of a page header you could specify whether you want pages with hash mismatches to be discarded or not, so that seeking ignores invalid hashes and normal operation when pages strictly follow each other would yield errors at a hash mismatch. I think such a change would be an improvement. Not sure if checking the sequence number is needed. PRs welcome!
Although extremely rare, the
OggS
magic pattern may occur in the places where a page does not start. CurrentUntilPageHeaderReader
does not check for the integrity it captured, so it may capture a wrong page span and then return falseHashMismatch
error.I have an idea of fixing this issue and currently working on it. It will require a large refactoring, and possibly cause breaking changes.