Closed timobrembeck closed 1 year ago
Did you try with the .mp4 as an empty file? Does it change anything to testing?
Did you try with the .mp4 as an empty file? Does it change anything to testing?
Yes, then the parsing works just fine and it does not throw the exception, but only shows 200 OK, broken external hash anchor
, because it thinks the document is empty and the anchor target is missing...
I wonder if we may check the response content type and not try to run _check_anchor on non-text content. We can still keep the new exception catching you introduced (as content-type can be faked), but that would avoid most of the time trying to parse big binary content.
Did you try with the .mp4 as an empty file? Does it change anything to testing?
Yes, then the parsing works just fine and it does not throw the exception, but only shows
200 OK, broken external hash anchor
, because it thinks the document is empty and the anchor target is missing...
Then maybe a minimal binary file, even not looking like a real mp4?
I wonder if we may check the response content type and not try to run _check_anchor on non-text content. We can still keep the new exception catching you introduced (as content-type can be faked), but that would avoid most of the time trying to parse big binary content.
Yeah, good idea, I will adapt the code. :+1:
Then maybe a minimal binary file, even not looking like a real mp4?
I tried to generate different random binary files of varying sizes, with no luck. So it seems to be something format-specific about some mp4 files. Would have to dig deeper into the traceback to understand which byte sequence exactly causes the error, but don't think it's worth the effort though. If you want, I can remove the test case again.
Then maybe a minimal binary file, even not looking like a real mp4?
I tried to generate different random binary files of varying sizes, with no luck. So it seems to be something format-specific about some mp4 files. Would have to dig deeper into the traceback to understand which byte sequence exactly causes the error, but don't think it's worth the effort though. If you want, I can remove the test case again.
Found that starting the binary string with <![
(e.g. something like b'<![x02\x00\xa0\xcc'
) is reproducing the crash. I'd rather use that than a 3Mb file.
Found that starting the binary string with
<![
(e.g. something likeb'<![x02\x00\xa0\xcc'
) is reproducing the crash. I'd rather use that than a 3Mb file.
Awesome, thanks for figuring that out! :+1:
Fix
NotImplementedError
/AssertionError
when checking video links with hash anchorsI struggled to find/generate a smaller video file for this test case, I ended up with a small excerpt from this stock video, do you think it's already sufficient to not include this file in the MANIFEST.in, or should I explicitly exclude it? Or do you think the file should be shipped with the package so the tests are complete?
Fixes: #150