internetarchive / dweb-archivecontroller

GNU Affero General Public License v3.0

Torrent's announce-list not generated properly #17

Closed by BubuAnabelas 3 years ago

BubuAnabelas commented 3 years ago

I tried downloading the dweb torrent files for this item via these URLs:

In both cases, the announce-list was:

"announce-list": [
  "http://bt1.archive.org:6969/announce",
  "http://bt2.archive.org:6969/announce",
  "wss://wt.archive.org:6969",
  "wss://tracker.btorrent.xyz",
  "wss://tracker.openwebtorrent.com",
  "wss://tracker.fastcast.nz"
]

The expected announce-list is:

"announce-list": [
  [
    "http://bt1.archive.org:6969/announce"
  ],
  [
    "http://bt2.archive.org:6969/announce"
  ],
  [
    "wss://wt.archive.org:6969"
  ],
  [
    "wss://tracker.btorrent.xyz"
  ],
  [
    "wss://tracker.openwebtorrent.com"
  ],
  [
    "wss://tracker.fastcast.nz"
  ]
]
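For context, BEP 12 defines announce-list as a list of tiers, where each tier is itself a list of tracker URLs. The sketch below checks for that shape and wraps bare URLs in single-URL tiers. The helper names `isTieredAnnounceList` and `normalizeAnnounceList` are hypothetical and not taken from mungeTorrent.js; treat this as one possible way to produce the expected structure, not the repository's actual fix.

```javascript
// Hypothetical helpers; names are illustrative, not from mungeTorrent.js.

// True when the value is a list of tiers, each tier a list of URL strings.
function isTieredAnnounceList(value) {
  return (
    Array.isArray(value) &&
    value.every(
      (tier) => Array.isArray(tier) && tier.every((url) => typeof url === "string")
    )
  );
}

// Wrap any bare URL string in its own single-URL tier; keep existing tiers as-is.
function normalizeAnnounceList(entries) {
  return entries.map((entry) => (Array.isArray(entry) ? entry : [entry]));
}

const flat = [
  "http://bt1.archive.org:6969/announce",
  "wss://wt.archive.org:6969",
];
console.log(isTieredAnnounceList(flat)); // false
console.log(JSON.stringify(normalizeAnnounceList(flat)));
// [["http://bt1.archive.org:6969/announce"],["wss://wt.archive.org:6969"]]
```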

When opened with qBittorrent v4.2.5, only the announce URL was used ("announce": "http://bt1.archive.org:6969/announce"), and when parsed with https://github.com/webtorrent/parse-torrent the result was:

announce: [
  '104', '116', '112', '58',  '47',
  '98',  '49',  '46',  '97',  '114',
  '99',  '105', '118', '101', '111',
  '103', '54',  '57',  '110', '117',
  '50',  '119', '115', '107', '120',
  '121', '122', '109', '102'
]
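The values above are the character codes of the distinct characters in the tracker URLs ('104' is "h", '116' is "t", '112' is "p", and so on). One plausible explanation, sketched below, is a parser that expects every announce-list entry to be a tier, iterates over a bare URL byte by byte, stringifies each byte, and then removes duplicates. The function `flattenAssumingTiers` is a hypothetical stand-in written for illustration; it is not parse-torrent's actual code.

```javascript
// Hypothetical stand-in for a parser that assumes announce-list is a list of
// tiers. Given a flat list of URL strings, each URL is treated as a "tier" and
// walked one byte at a time.
function flattenAssumingTiers(announceList) {
  const seen = new Set();
  for (const tier of announceList) {
    // Bencoded strings arrive as byte sequences, so iterating over a bare URL
    // yields numeric byte values rather than URLs.
    const items =
      typeof tier === "string"
        ? Array.from(tier, (ch) => ch.charCodeAt(0))
        : tier;
    for (const item of items) seen.add(String(item));
  }
  return Array.from(seen); // duplicates dropped, first-seen order kept
}

const flatTrackers = [
  "http://bt1.archive.org:6969/announce",
  "http://bt2.archive.org:6969/announce",
  "wss://wt.archive.org:6969",
  "wss://tracker.btorrent.xyz",
  "wss://tracker.openwebtorrent.com",
  "wss://tracker.fastcast.nz",
];
console.log(flattenAssumingTiers(flatTrackers).slice(0, 5));
// [ '104', '116', '112', '58', '47' ]
```

With properly nested tiers the same function returns the URLs themselves, which is why the nested form parses cleanly.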

I believe the issue is around these lines:

https://github.com/internetarchive/dweb-archivecontroller/blob/89868c6bda987071b6986dfd73fc362ed373f1c7/mungeTorrent.js#L36-L37

mitra42 commented 3 years ago

What I'm seeing is parse-torrent 'https://archive.org/download/commute/commute_archive.torrent' gets:

  "announce": [
    "http://bt1.archive.org:6969/announce",
    "http://bt2.archive.org:6969/announce"
  ],

These are known to be broken (http only).

With parse-torrent https://www-dweb-cors.dev.archive.org/download/78_house-of-the-rising-sun_josh-white-and-his-guitar_gbia0001628b/78_house-of-the-rising-sun_josh-white-and-his-guitar_gbia0001628b_archive.torrent I see what you see, i.e. the announce array broken into a string of character codes.

Will look into that code.

mitra42 commented 3 years ago

@BubuAnabelas - it would help if you have a URL of a torrent that you think is correct.

BubuAnabelas commented 3 years ago

@mitra42 See, for example, WebTorrent's torrents (https://github.com/webtorrent/webtorrent.io/blob/master/static/torrents/sintel.torrent), which also include some WebTorrent (WebSocket) trackers.

When parsing https://archive.org/download/commute/commute_archive.torrent the resulting announce-list array looks like this:

"announce-list": [
  [
    "http://bt1.archive.org:6969/
  ],
  [
    "http://bt2.archive.org:6969/announce"
  ]
]

As a side note, all the announce-list arrays were generated using the Bencode Online website, which takes a bencoded file (the torrent file), parses it, and displays it as JSON.
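To illustrate what such a bencode-to-JSON view does, here is a minimal decoder sketch. It assumes ASCII-only input held in a JavaScript string (real torrent files contain binary data and need a byte-level decoder), and the function name `bdecode` is chosen here for illustration only.

```javascript
// Minimal bencode decoder sketch for ASCII input; not a complete implementation.
function bdecode(input) {
  let pos = 0;

  function parse() {
    if (pos >= input.length) throw new Error("unexpected end of input");
    const ch = input[pos];
    if (ch === "i") {
      // Integer: i<digits>e
      const end = input.indexOf("e", pos);
      if (end === -1) throw new Error("unterminated integer");
      const value = Number(input.slice(pos + 1, end));
      pos = end + 1;
      return value;
    }
    if (ch === "l") {
      // List: l<items>e
      pos += 1;
      const list = [];
      while (input[pos] !== "e") list.push(parse());
      pos += 1;
      return list;
    }
    if (ch === "d") {
      // Dictionary: d<key><value>...e, keys are strings
      pos += 1;
      const dict = {};
      while (input[pos] !== "e") {
        const key = parse();
        dict[key] = parse();
      }
      pos += 1;
      return dict;
    }
    // String: <length>:<contents>
    const colon = input.indexOf(":", pos);
    if (colon === -1) throw new Error("malformed string");
    const length = Number(input.slice(pos, colon));
    const start = colon + 1;
    pos = start + length;
    return input.slice(start, pos);
  }

  return parse();
}

const trackerUrl = "http://bt1.archive.org:6969/announce";
const encoded = `d13:announce-listll${trackerUrl.length}:${trackerUrl}eee`;
console.log(JSON.stringify(bdecode(encoded)));
// {"announce-list":[["http://bt1.archive.org:6969/announce"]]}
```

The doubled `ll` in the encoded form is the outer list of tiers followed by the first tier, which is the nesting that shows up as arrays of arrays in the JSON view.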

mitra42 commented 3 years ago

Right - I think that is part of the confusion, since announce-list and announce use different formats.
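To make the difference concrete: in the torrent metainfo, announce is a single tracker URL string, while announce-list (BEP 12) is a list of tiers, each tier a list of URLs. parse-torrent's own output merges these into one flat announce array, which is a representation of the parsed result rather than the on-disk format. The helper `withTrackers` below is a hypothetical sketch of writing both fields consistently; it is not the code in mungeTorrent.js.

```javascript
// Hypothetical helper: return a copy of a decoded torrent object with both
// tracker fields set from one flat list of tracker URLs.
function withTrackers(torrent, trackers) {
  if (trackers.length === 0) return { ...torrent };
  return {
    ...torrent,
    // "announce" holds exactly one URL (the primary tracker).
    announce: trackers[0],
    // "announce-list" holds tiers; here each tracker gets its own tier.
    "announce-list": trackers.map((url) => [url]),
  };
}

const updated = withTrackers({ info: {} }, [
  "http://bt1.archive.org:6969/announce",
  "wss://wt.archive.org:6969",
]);
console.log(updated.announce);
// http://bt1.archive.org:6969/announce
console.log(JSON.stringify(updated["announce-list"]));
// [["http://bt1.archive.org:6969/announce"],["wss://wt.archive.org:6969"]]
```

Putting each tracker in its own tier matches the expected structure reported in this issue; grouping several URLs into one tier is also valid under BEP 12 and changes only the order in which clients try them.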

I made some fixes and parse-torrent 'https://www-dweb-cors.dev.archive.org/download/78_house-of-the-rising-sun_josh-white-and-his-guitar_gbia0001628b/78_house-of-the-rising-sun_josh-white-and-his-guitar_gbia0001628b_archive.torrent' now gets

  "announce": [
    "http://bt1.archive.org:6969/announce",
    "http://bt2.archive.org:6969/announce",
    "wss://wt.archive.org:6969",
    "wss://tracker.btorrent.xyz",
    "wss://tracker.openwebtorrent.com",
    "wss://tracker.fastcast.nz"
  ],

Funnily enough parse-torrent 'https://github.com/webtorrent/webtorrent.io/blob/master/static/torrents/sintel.torrent' actually fails :-(

mitra42 commented 3 years ago

parse-torrent https://dweb.me/btih/5ab236e31f381fa4ff7cd65c9eb3825d27a79fdd?output=torrent also works now (it uses www-dweb-cors)

BubuAnabelas commented 3 years ago

> Funnily enough parse-torrent 'https://github.com/webtorrent/webtorrent.io/blob/master/static/torrents/sintel.torrent' actually fails :-(

I've just tested it with the torrent locally and it parses without any problem. I should add that parse-torrent just takes announce-list and, after parsing, outputs it as the announce array, so it's fine that it displays the array like that.

After some quick tests, it looks like it's solved! Thank you very much @mitra42