archlinux / archweb

Arch Linux website code
https://archlinux.org
GNU General Public License v2.0

Support v2 and hybrid v1&v2 torrents #432

Open nl6720 opened 2 years ago

nl6720 commented 2 years ago

From what I can see, archweb doesn't support v2 and hybrid v1&v2 torrents in release.json. Adding a base64-encoded hybrid or v2 torrent as torrent_data in releng/fixtures/release.json fails with a "not a valid bencoded string" error.

See https://blog.libtorrent.org/2020/09/bittorrent-v2/ for details on BitTorrent protocol v2.

The SHA-256-based v2 info hash needs to be exposed in templates/releng/release_detail.html and included in magnet links.
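As a rough illustration of the request above, the sketch below derives both info hashes from a toy hybrid info dict and builds a magnet link from them. It follows BEP 52, where the v2 info hash is the SHA-256 of the bencoded info dict and appears in magnet links as a urn:btmh: multihash (hex prefix 1220). The small bencode() function, the info_hashes() and magnet_link() helpers, and all sample values are assumptions made for this example, not archweb code.

```python
import hashlib
from urllib.parse import quote


def bencode(obj) -> bytes:
    """Minimal bencoder for ints, bytes, str, lists and dicts (illustrative only)."""
    if isinstance(obj, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(obj, int):
        return b"i%de" % obj
    if isinstance(obj, str):
        obj = obj.encode("utf-8")
    if isinstance(obj, bytes):
        return b"%d:%s" % (len(obj), obj)
    if isinstance(obj, (list, tuple)):
        return b"l" + b"".join(bencode(item) for item in obj) + b"e"
    if isinstance(obj, dict):
        # Dictionary keys are byte strings, emitted in raw byte order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in obj.items()]
        items.sort(key=lambda pair: pair[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(obj).__name__}")


def info_hashes(info: dict) -> dict:
    """Return hex info hashes; a missing flavor is reported as None.

    Real code should hash the info dict bytes exactly as they appear in the
    .torrent file; re-encoding is only safe if it round-trips byte for byte.
    """
    raw = bencode(info)
    hashes = {"v1": None, "v2": None}
    if "pieces" in info:                      # v1 piece hashes present
        hashes["v1"] = hashlib.sha1(raw).hexdigest()
    if info.get("meta version") == 2:         # v2 marker from BEP 52
        hashes["v2"] = hashlib.sha256(raw).hexdigest()
    return hashes


def magnet_link(hashes: dict, name: str) -> str:
    """Build a magnet URI carrying whichever info hashes are available."""
    parts = []
    if hashes["v1"]:
        parts.append("xt=urn:btih:" + hashes["v1"])
    if hashes["v2"]:
        # 0x12 = sha2-256 multihash code, 0x20 = digest length in bytes.
        parts.append("xt=urn:btmh:1220" + hashes["v2"])
    parts.append("dn=" + quote(name))
    return "magnet:?" + "&".join(parts)


# Toy hybrid info dict; every value here is a placeholder, not real release data.
hybrid_info = {
    "name": "example.iso",
    "length": 1,
    "piece length": 16384,
    "pieces": b"\x00" * 20,
    "meta version": 2,
    "file tree": {"example.iso": {"": {"length": 1, "pieces root": b"\x00" * 32}}},
}

print(magnet_link(info_hashes(hybrid_info), hybrid_info["name"]))
```

A template could then show the v2 hash only when info_hashes() returns one, so existing v1-only releases would render as before.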

Torxed commented 11 months ago

Quick check, we should change: https://github.com/archlinux/archweb/blob/e864c90936157be5452e81521a717208761ca056/releng/models.py#L6

and use libtorrent instead, as it's more likely to be up to date. It also has libtorrent.bdecode(), which does a much better job (and is v2 compliant). The only downside is that it returns bytes objects for both the keys and values of the dict, so if that is an issue we need to use something like:

import datetime
import json

import libtorrent

def jsonify(obj):
    """
    Converts objects into json.dumps() compatible nested dictionaries.
    """

    compatible_types = str, int, float, bool, bytes
    if isinstance(obj, dict):
        return {
            jsonify(key): jsonify(value)
            for key, value in obj.items()
            if isinstance(key, compatible_types)
        }
    if isinstance(obj, bytes):
        return obj.decode('UTF-8', errors='replace')
    if isinstance(obj, (datetime.datetime, datetime.date)):
        return obj.isoformat()
    if isinstance(obj, (list, set, tuple)):
        return [jsonify(item) for item in obj]

    return obj

class JSON(json.JSONEncoder, json.JSONDecoder):
    def encode(self, obj) -> str:
        return super().encode(jsonify(obj))


# `torrent` stands for the torrent's metadata dict; it is not defined in this snippet.
json.dumps(libtorrent.bdecode(libtorrent.bencode(torrent)), cls=JSON)

The above is a crude snippet, and it might only solve the "not a valid bencoded string" error.
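Since a decoder that returns bytes keys changes how the decoded dict is queried, here is a minimal sketch of telling v1, v2 and hybrid torrents apart using bytes lookups. The pure-Python bdecode() below is only a stand-in so the example runs without libtorrent, and torrent_flavor() is a hypothetical helper name; the "meta version" and "pieces" keys come from BEP 52 and BEP 3 respectively.

```python
from base64 import b64decode, b64encode


def _decode(data: bytes, pos: int):
    """Decode one bencoded value starting at pos; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = _decode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = _decode(data, pos)
            value, pos = _decode(data, pos)
            result[key] = value          # keys stay as bytes
        return result, pos + 1
    if lead.isdigit():
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        if start + length > len(data):
            raise ValueError("truncated byte string")
        return data[start:start + length], start + length
    raise ValueError(f"invalid bencoded data at offset {pos}")


def bdecode(data: bytes):
    """Minimal bencode decoder; strings and dict keys are returned as bytes."""
    value, end = _decode(data, 0)
    if end != len(data):
        raise ValueError("trailing data after bencoded value")
    return value


def torrent_flavor(torrent_data: str) -> str:
    """Classify base64-encoded torrent data as 'v1', 'v2', 'hybrid' or 'unknown'."""
    meta = bdecode(b64decode(torrent_data))
    info = meta.get(b"info", {})
    has_v1 = b"pieces" in info                   # v1 piece hashes
    has_v2 = info.get(b"meta version") == 2      # v2 marker
    if has_v1 and has_v2:
        return "hybrid"
    if has_v2:
        return "v2"
    if has_v1:
        return "v1"
    return "unknown"


# Hand-written toy v2 torrent; the contents are placeholders.
raw_v2 = b"d4:infod9:file treede12:meta versioni2e4:name1:a12:piece lengthi16384eee"
print(torrent_flavor(b64encode(raw_v2).decode("ascii")))  # v2
```

Code that currently reads str keys such as 'info' or 'name' would need either bytes keys as shown here or a conversion step like jsonify() before those lookups.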

jelly commented 11 months ago

Luckily, we store the Release torrent_data as a base64-encoded file, so that should allow a switch. Another bencode user is:

    def torrent(self):
        try:
            data = b64decode(self.torrent_data.encode('utf-8'))
        except (TypeError, binascii.Error):
            return None
        if not data:
            return None
        data = bdecode(data)
        # transform the data into a template-friendly dict
        info = data.get('info', {})
        metadata = {
            'comment': data.get('comment', None),
            'created_by': data.get('created by', None),
            'creation_date': None,
            'announce': data.get('announce', None),
            'file_name': info.get('name', None),
            'file_length': info.get('length', None),
            'piece_count': len(info.get('pieces', '')) / 20,
            'piece_length': info.get('piece length', None),
            'url_list': data.get('url-list', []),
            'info_hash': None,
        }
        if 'creation date' in data:
            metadata['creation_date'] = datetime.fromtimestamp(data['creation date'], tz=timezone.utc)
        if info:
            metadata['info_hash'] = hashlib.sha1(bencode(info)).hexdigest()