Open milahu opened 1 year ago
parse_torrent_file will use Python's default dict, but it provides a use_ordered_dict argument, to use collections.OrderedDict. This parameter should be used in this scenario. It seems that this test is not rigorous. I will fix it when I have time.
And for now, you can just use the code in the test file (with use_ordered_dict=True) to calculate the info hash. For more context, see Issue #13.
Oh, I notice that the "sorted" you want to discuss is not what I had in mind.
If you are talking about lexicographic order, the current implementation does not follow it. There is no mandatory requirement for the dictionary to be in order when parsing, and it will not actively perform sorting operations during encoding.
But this does not seem to affect the calculation of the info hash, as long as the encoding step generates the keys of the info dict in the same order as the original file bytes (by adding the use_ordered_dict=True parameter).
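To illustrate why the key order matters here, a minimal sketch of a bencode encoder that writes dict keys in iteration order without sorting. This is not the torrent_parser code; the bencode function and the toy info dicts are made up for illustration, and the toy dicts leave out fields like pieces that a real info dict would have.

```python
import hashlib


def bencode(value):
    """Encode ints, str, bytes, lists and dicts as bencode.

    Dict keys are written in iteration order and are NOT sorted
    (hypothetical helper, only meant to mirror the behaviour described above).
    """
    if isinstance(value, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        parts = [bencode(key) + bencode(val) for key, val in value.items()]
        return b"d" + b"".join(parts) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


# Same keys and values, different key order.
info_sorted = {"length": 12, "name": "a.txt", "piece length": 16384}
info_unsorted = {"name": "a.txt", "length": 12, "piece length": 16384}

hash_sorted = hashlib.sha1(bencode(info_sorted)).hexdigest()
hash_unsorted = hashlib.sha1(bencode(info_unsorted)).hexdigest()

# The encoded bytes differ, so the digests differ as well.
print(hash_sorted == hash_unsorted)  # False
```

So the info hash only matches the original file when the re-encoded keys come out in the same order as in the source bytes.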
> There is no mandatory requirement for the dictionary to be in order when parsing
this would be nice to preserve the infohash
> it provides a use_ordered_dict argument, to use collections.OrderedDict
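this is needed only for python2 (strictly, for any python before 3.7) and then torrent_parser should use OrderedDict automatically, to preserve the infohash
import sys
if sys.version_info < (3, 7):
    # plain dict does not guarantee insertion order before python 3.7
    from collections import OrderedDict
result = dict()
if sys.version_info < (3, 7):
    result = OrderedDict()
in python3 (guaranteed since 3.7), dict preserves insertion order like an OrderedDict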
>>> dict(b=2, a=1)
{'b': 2, 'a': 1}
>>> { "b": 2, "a": 1 }
{'b': 2, 'a': 1}
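A small sketch of what "ordered" means here, assuming python 3.7 or later. A plain dict keeps insertion order when iterating, but unlike OrderedDict its equality check ignores order:

```python
from collections import OrderedDict

plain = dict(b=2, a=1)
ordered = OrderedDict([("b", 2), ("a", 1)])

# Both iterate in insertion order on python 3.7+.
print(list(plain))    # ['b', 'a']
print(list(ordered))  # ['b', 'a']

# Plain dict equality ignores key order, OrderedDict equality does not.
print(plain == dict(a=1, b=2))                       # True
print(ordered == OrderedDict([("a", 1), ("b", 2)]))  # False
```

For re-encoding a parsed torrent only the iteration order matters, so a plain dict is enough on those versions.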
alternative solution: the parser could calculate the infohashes from the raw source bytes of the info dict, and store the infohashes in attributes of the result data dict. calculating sha1 and sha256 digests should be cheap enough to make this the default for parse_torrent_file. internally, only the raw hashes are stored. the _hex attributes return _raw.hex() (in python3)
torrent = torrent_parser.parse_torrent_file("input.torrent")
if torrent.has_v1:
    info_hash_v1_raw = torrent.info_hash_v1_raw # -> bytes
    info_hash_v1_hex = torrent.info_hash_v1_hex # -> string
if torrent.has_v2:
    info_hash_v2_raw = torrent.info_hash_v2_raw # -> bytes
    info_hash_v2_hex = torrent.info_hash_v2_hex # -> string
alternatively, we could store the source locations of the info dict, and the user has to read the file again and calculate the digest manually. but IMO, the infohash is always useful when dealing with torrent files
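A rough sketch of the raw-bytes idea, using only the standard library. It does not use torrent_parser at all: _skip_value, raw_info_bytes and info_hashes are hypothetical names, and the scanner assumes well-formed bencode. It finds the byte span of the top-level info value and hashes that span directly, so key order and re-encoding never come into play.

```python
import hashlib


def _skip_value(data, pos):
    """Return the index just past the bencoded value starting at pos."""
    lead = data[pos:pos + 1]
    if lead == b"i":                      # integer: i<digits>e
        return data.index(b"e", pos) + 1
    if lead in (b"l", b"d"):              # list or dict: items until 'e'
        pos += 1
        while data[pos:pos + 1] != b"e":
            pos = _skip_value(data, pos)
        return pos + 1
    if lead.isdigit():                    # byte string: <length>:<bytes>
        colon = data.index(b":", pos)
        return colon + 1 + int(data[pos:colon])
    raise ValueError("invalid bencode at offset %d" % pos)


def raw_info_bytes(data):
    """Return the exact source bytes of the top-level 'info' value, or None."""
    if data[:1] != b"d":
        raise ValueError("torrent data must start with a dict")
    pos = 1
    while data[pos:pos + 1] != b"e":
        if not data[pos:pos + 1].isdigit():
            raise ValueError("dict key at offset %d is not a string" % pos)
        key_end = _skip_value(data, pos)
        colon = data.index(b":", pos)
        key = data[colon + 1:key_end]
        value_end = _skip_value(data, key_end)
        if key == b"info":
            return data[key_end:value_end]
        pos = value_end
    return None


def info_hashes(data):
    """Hash the raw info bytes with sha1 (v1) and sha256 (v2)."""
    info = raw_info_bytes(data)
    if info is None:
        raise ValueError("no info dict found")
    v1, v2 = hashlib.sha1(info), hashlib.sha256(info)
    return {
        "v1_raw": v1.digest(), "v1_hex": v1.hexdigest(),
        "v2_raw": v2.digest(), "v2_hex": v2.hexdigest(),
    }


# Toy input: the info keys are deliberately not sorted.
sample = b"d8:announce14:http://tracker4:infod4:name5:a.txt6:lengthi12eee"
print(raw_info_bytes(sample))  # b'd4:name5:a.txt6:lengthi12ee'
print(info_hashes(sample)["v1_hex"])
```

The sketch computes both digests unconditionally. Deciding whether a torrent actually is v1, v2 or hybrid (the has_v1 / has_v2 checks above) would still need a look at the keys inside the info dict.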
currently the v1 hash appears only in this test
https://github.com/7sDream/torrent_parser/blob/23b9e110beb5b91c5498b286bd9d8cce83cfc076/tests/test_info_hash.py#L15-L20
expected:
get_info_hash_v2 simply uses hashlib.sha256 instead of hashlib.sha1
binascii.hexlify can be avoided by using hexdigest
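Something like this, as a sketch. Both helpers are hypothetical (neither exists in torrent_parser as far as this thread shows) and they assume the caller already has the bencoded bytes of the info dict:

```python
import binascii
import hashlib


def get_info_hash_v1(info_bytes):
    """Hex sha1 digest of the bencoded info dict (hypothetical helper)."""
    return hashlib.sha1(info_bytes).hexdigest()


def get_info_hash_v2(info_bytes):
    """Hex sha256 digest of the bencoded info dict (hypothetical helper)."""
    return hashlib.sha256(info_bytes).hexdigest()


info_bytes = b"d4:name5:a.txt6:lengthi12ee"  # toy info dict

# hexdigest() gives the same string as hexlify(digest()).decode()
via_hexlify = binascii.hexlify(hashlib.sha1(info_bytes).digest()).decode()
print(via_hexlify == get_info_hash_v1(info_bytes))  # True
print(len(get_info_hash_v1(info_bytes)), len(get_info_hash_v2(info_bytes)))  # 40 64
```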
related: https://stackoverflow.com/questions/46025771/python3-calculating-torrent-hash
stupid question: does parse_torrent_file preserve the sort order of the info dict? since python3, dict should be an ordered dict by default
https://stackoverflow.com/questions/19749085/calculating-the-info-hash-of-a-torrent-file