Closed richardlehane closed 2 years ago
This is still on my radar! (Although technically I've just reminded myself of it...)
Hi Ross, working on cutting a release with v100, this bug hits three more skeleton files: fmt-1502-container-signature-id-57000.psz fmt-1518-container-signature-id-22810.spp x-fmt-244-container-signature-id-4090.mpp
Thanks Richard.
We may have chatted about this before, apologies if so.
Taking one example:
<Files>
<File>
<Path>CompObj</Path>
<BinarySignatures>
<InternalSignatureCollection>
<InternalSignature ID="4090">
<ByteSequence Reference="BOFoffset">
<SubSequence Position="1" SubSeqMinOffset="40" SubSeqMaxOffset="1024">
<Sequence>14 00 00 00 'MSProject.DocFile.4' 00</Sequence>
</SubSequence>
</ByteSequence>
<ByteSequence Reference="BOFoffset">
<SubSequence Position="2" SubSeqMinOffset="40" SubSeqMaxOffset="1024">
<Sequence>16 00 00 00 'MSProject.Project.4_1' 00</Sequence>
</SubSequence>
</ByteSequence>
</InternalSignature>
</InternalSignatureCollection>
</BinarySignatures>
</File>
</Files>
Do you think it matters too much how the generator disambiguates those? Should it just try and fit the two sequences between bytes 40 and 1024?
Part of me wonders why this isn't an issue in how the signature is described (without seeking back to 40, sequence two must always appear after sequence 1, but within 1024 bytes?), but that seems not to be the case?
I don't think just fitting them both between bytes 40 and 1024 would necessarily work, since the second sequence needs to be at least 40 bytes from the first.
I think the rule for your example is:
Yup, that makes sense! Although, bear with me, do you know if the generator can handle the following workaround, and a secondary question, does it have the same intent as you describe?
(off the top of my head):
<Files>
<File>
<Path>CompObj</Path>
<BinarySignatures>
<InternalSignatureCollection>
<InternalSignature ID="4090">
<ByteSequence Reference="BOFoffset">
<SubSequence Position="1" SubSeqMinOffset="40" SubSeqMaxOffset="1024">
<Sequence>14 00 00 00 'MSProject.DocFile.4' 00 {40-996} 16 00 00 00 'MSProject.Project.4_1' 00</Sequence>
</SubSequence>
</ByteSequence>
</InternalSignature>
</InternalSignatureCollection>
</BinarySignatures>
</File>
</Files>
If so, it could be a temporary solution for v100. Better than that, that's one way I can pre-process the signature and put it back through the generator functions.
That looks like it should work to me. But rather than 996 that range could be the full 1024?
I've added you to the "builder" repo &you can test it there if you like (may save you setting up the full environment needed for building the suite). To test there just make those edits to the container file & add a tag e.g. v100-testing, then push those up and the github action should take care of the rest
I've added you to the "builder" repo
Thanks for that Richard - looking forward to giving it a go. I've put a few commits against my code tonight, but will need to take a bit more time to chip away with the fix itself. I'm happy there's already logging detecting the issue, so I think I can write something within that pretty easily. But let's see.
That looks like it should work to me. But rather than 996 that range could be the full 1024?
Most likely! I keep getting stuck with a visual of a 1024 byte frame in my head for both sequences to fit into.
It looks like a fix is coming together pretty quickly, I'm just going to neaten some things up and have a commit for review shortly.
I am having an issue with fmt-1190-container-signature-id-28200.swc
. @richardlehane can you take a look and see if it:
a) is okay in Siegfried, b) if not, whether the sequences look incorrect?
My reading is that it is being written okay. It's odd too that it's the only one not writing out correctly.
This commit should fix the issue, with the exception of .swc
all look good :crossed_fingers:
Will clean up some other small pieces of detritus and work to merge this.
I won't merge tonight, and I haven't had a chance to try builder but dev/issue-7-offset-overlap-issues
is where it's all at. I probably won't touch that again before merging. There's a tranche of work still needed to modernize this I'll do separately. Will rebase and merge in a few days depending on feedback and any further analysis I can do on `.swc'.
.swc looks like an issue with siefried's parser:
Do you see how the first sequence is anchored to "P" and the second to "B"? It should be the reverse, P stands for previous and B is BOF. Your file should match but it doesn't because sf is testing the offset range against the start of file rather than relative to the previous match.
Another notch on the belt of the skeleton suite!
I'm nearing a release and will make sure this gets fixed as it is a biggy. Let me know if there's anything you want included in the next release too.
this issue can be traced to Position="0"
which occurs 6 times in the container signature file
Thanks @richardlehane I appreciate you checking.
It looks like file has caught out DROID too which is where I was doubting myself.
I'll need to report an issue for them too.
this issue can be traced to Position="0" which occurs 6 times in the container signature file
This is happening because the majority of signatures don't use zero-based indexing? That would make sense that it has the potential to cause some difficulty.
Glad sf isn't alone in this, suggests the real problem is the container file itself and that the first byte sequence really should have Position="1" (as a comparison, Position="0" doesn't appear at all in the droid signature file).
This error may not be significant for all 6 occurrences but it matters for fmt/1190 because there are two bytesequences in this signature. The second bytesequence has Position="1", which is the real problem as there is an implicit ordering there that isn't being picked up because any bytesequences with "Position=1" are interpreted as being relative to the BOF/EOF rather than to a preceding bytesequence.
Okay, this should be fixed now, the actual fix is https://github.com/exponential-decay/skeleton-container-test-suite-generator/pull/14/commits/67c49cf192939b814ed3d9d7de1127afe1c57d0d and there are a bunch of other aesthetic and code-quality improvements captured in later commits. Let me know if there are any problems @richardlehane and I can look asap. Closing this one for now as it all seems good, but again, we can come back to it if need be.
These signatures both have multiple patterns with offsets
SubSeqMinOffset="0" SubSeqMaxOffset="1024"
The generator is incorrectly positioning sequences in the pattern. The first sequence in these patterns is put in the middle of the range (at offset 500 something). But the next sequence doesn't appear until about offset 2096 (i.e. adding the max offsets + length of the previous sequence). This is too far out (if the first sequence is at 500, should be at a max of about 1596 or something. offsets.zip