Closed 955b8acf-25ae-499f-be8e-a029f22a3534 closed 14 years ago
http://svn.python.org/view/python/branches/py3k/Lib/sndhdr.py?r1=56957&r2=56987 seems to have broken sndhdr as it incorrectly detects files as voc (all that time ago and nobody has noticed...):
[jbit@miku]~$ xxd test.wav | head -n 1
0000000: 5249 4646 2478 e001 5741 5645 666d 7420 RIFF$x..WAVEfmt
[jbit@miku]~$ python3.1
>>> import sndhdr
>>> sndhdr.what("test.wav")
('voc', 0, 1, -1, 8)
I'm a little unsure why it got changed from indexing to startswith() at all, but changes like the following are just totally incorrect.
Attached is a patch against the py3k branch which fixes the issue as well as some other things obviously broken by the same changelist, although it might be a good idea to review the changelist again to make sure there's nothing else obviously broken :)
>>> sndhdr.what("test.wav")
('wav', 48000, 2, -1, 16)
Since email uses this and it doesn't look like anyone else is likely to have much knowledge of this module, I'm going to assign it to myself so I don't forget about it.
This module has no unit tests. It would be really great if you could create some to at least test for this regression. I'll be glad to provide guidance on the test suite if you need it, but I don't know all that much about audio files so I'm not in a position to write the tests myself at the moment.
The first changeset is just the translation to bytes, I don't see any issues there. I'm adding Victor as nosy since he wrote the second changeset, but the fix looks straightforwardly correct to me.
By the way, if you had written r56957 and r56987, roundup would have turned them into links, instead of quoting the link you did supply so that it wasn't usable :)
Oh, and the reason it got changed to startswith is probably a micro-optimization: startswith is more efficient than doing a slice-and-compare.
Thanks for the update... The link was actually just a "diff to previous" of the changelist that caused the problem (r56987), sorry for the confusion. :)
Attached is a quick and dirty unittest complete with some test files. It only tests the format of the file though, not any of the additional information returned.
With an untouched sndhdr.py it finds the issue reported, and with sndhdr-fix.patch applied it reports only one issue:
Traceback (most recent call last):
File "test_sndhdr.py", line 22, in test_aiff
self.assertEqual(ret[0], 'aiff')
AssertionError: b'aiff' != 'aiff'
I'm thinking that on line 65 of sndhdr.py the b'aiff' should be just 'aiff' (like everything else). At least I can't see a reason for it to be different. (Changing this fixes the unittest issue)
Cheers
Oh, and the reason it got changed to startswith is probably a micro-optimization: startswith is more efficient than doing a slice-and-compare.
Not really. I changed it because it's more readable (it's easier to understand that the code).
--
James: I commited all your fixes + your test suite to 3.1 (r82859 + r82860) and 3.2 (r82856 + r82857) and added your name to Misc/ACKS. Thanks for your contribution.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields: ```python assignee = 'https://github.com/bitdancer' closed_at =
created_at =
labels = ['type-bug', 'library']
title = 'sndhdr erroneously detects files as vox'
updated_at =
user = 'https://bugs.python.org/jbit'
```
bugs.python.org fields:
```python
activity =
actor = 'vstinner'
assignee = 'r.david.murray'
closed = True
closed_date =
closer = 'vstinner'
components = ['Library (Lib)']
creation =
creator = 'jbit'
dependencies = []
files = ['17981', '17986']
hgrepos = []
issue_num = 9243
keywords = ['patch']
message_count = 5.0
messages = ['110172', '110198', '110199', '110206', '110243']
nosy_count = 3.0
nosy_names = ['vstinner', 'r.david.murray', 'jbit']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = 'test needed'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue9243'
versions = ['Python 3.1', 'Python 3.2']
```