anacrolix / confluence

Torrent client as a HTTP service
Mozilla Public License 2.0

Suggest to support http url of a torrent file #29

Closed chaos369 closed 2 years ago

chaos369 commented 3 years ago

like 'http://localhost:8080/info?torrent=https://abc.com/Sintel.torrent'

anacrolix commented 3 years ago

Could you provide more information on why you think this is good? Why shouldn't the caller obtain the metainfo and send it to confluence, as currently happens?

chaos369 commented 3 years ago

I think it's better if callers do not need to parse a torrent file themselves, because that is the job of the torrent software. Confluence is a service that handles the torrent-related work.

sgmihai commented 2 years ago

This is the first obvious thing that came to my mind when looking at the commands available. It should be able to take both a URL where the .torrent file can be found and a local path to it. The first use case is for torrents that are not reachable via DHT with a metalink. Another would be not having to wait until the DHT peers are found. There are many others as well. This is an essential feature; please consider it for the next version. It should be trivial to implement.

anacrolix commented 2 years ago

I'm just not seeing how it's better than sending the metainfo to confluence yourself. If confluence performs arbitrary HTTP requests, it may then need to expose configuration for requests that require non-default settings. With https://github.com/anacrolix/confluence/tree/metainfo-post-no-query you can do curl "http://localhost:8080/info?ih=$(curl https://webtorrent.io/torrents/sintel.torrent | curl http://localhost:8080/metainfo -X POST --data-binary @-)". Let me know what you think.
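
The curl pipeline above can also be driven from a script. The following is a minimal Python sketch of the same two steps, not part of confluence itself. It assumes a confluence instance built from the metainfo-post-no-query branch is listening on localhost:8080, and that POST /metainfo answers with the hex infohash in the response body, as the curl example relies on. The function names are hypothetical.

```python
import urllib.parse
import urllib.request

CONFLUENCE = "http://localhost:8080"  # address used in this thread


def info_url(infohash: str, base: str = CONFLUENCE) -> str:
    """Build the /info query URL for a hex infohash."""
    return f"{base}/info?" + urllib.parse.urlencode({"ih": infohash})


def post_metainfo(metainfo: bytes, base: str = CONFLUENCE) -> str:
    """POST raw .torrent bytes to /metainfo.

    Assumption: the response body is the hex infohash, as in the curl example.
    """
    req = urllib.request.Request(f"{base}/metainfo", data=metainfo, method="POST")
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode().strip()


def fetch_info(torrent_url: str, base: str = CONFLUENCE) -> bytes:
    """Download a .torrent file, hand it to confluence, then query /info."""
    with urllib.request.urlopen(torrent_url) as resp:
        metainfo = resp.read()
    infohash = post_metainfo(metainfo, base)
    with urllib.request.urlopen(info_url(infohash, base)) as resp:
        return resp.read()
```

Calling fetch_info("https://webtorrent.io/torrents/sintel.torrent") would mirror the one-liner. For a torrent file that is already on disk, read its bytes and pass them to post_metainfo directly.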

sgmihai commented 2 years ago

I tested that branch, and it works, thanks. But there is a little downside to it, actually two.

  1. Only the infohash is passed, still requiring DHT to find peers for the torrent (which can take a few seconds, sometimes more). If the trackers would also be passed, then it would be speedier. Also, the metadata has to be downloaded from a peer, but this probably won't cause much delay normally.
  2. For the same reason, lack of access to tracker data, it won't work for private torrents, which do not announce on DHT. Of course this is problematic from another perspective, that most places that host private torrents require a client ID that is present on their whitelist.

For my own immediate personal needs, the solution above is satisfactory, I guess. But I still think having the ability to grab the .torrent file from a URL or local path is ideal.

Let me know what you think.

anacrolix commented 2 years ago

> 1. Only the infohash is passed, still requiring DHT to find peers for the torrent (which can take a few seconds, sometimes more). If the trackers would also be passed, then it would be speedier. Also, the metadata has to be downloaded from a peer, but this probably won't cause much delay normally.

I'm not sure I follow. When you post the metainfo to confluence, it loads it, merges its contents into the active Torrent that it maintains, and saves a copy to its cache. That includes the trackers and the info in the metainfo.

> 2. For the same reason, lack of access to tracker data, it won't work for private torrents, which do not announce on DHT. Of course this is problematic from another perspective, that most places that host private torrents require a client ID that is present on their whitelist.

I think this is addressed by my answer to point 1, if my understanding is correct.

> For my own immediate personal needs, the solution above is satisfying, I guess. But I still think having the ability to grab the .torrent file from a URL or local path is ideal.

> • Local path, because you might have the torrent locally and the original host might be down. And it would be cumbersome to set up a local http server just to be able to pass it to confluence OR manually get its infohash to pass.

Again, I'm not sure I follow: if you have a local copy of the metainfo, just post it to confluence.

> • A bit speedier, because (if the torrent has any) the tracker announce is normally faster and metadata download from a peer is not needed
> • It can work with private torrents as well (if you get around the peer_id problem, not sure if you can alter that without changing the source code).

All this is related to 1 and 2 above.

I think maybe your goals aren't clear to me. Do you want confluence to extract the info for you from a metainfo, or do you want it to make use of the metadata in the metainfo that you have available?

sgmihai commented 2 years ago

What I mean is, this part of the command you suggested above: curl https://webtorrent.io/torrents/sintel.torrent | curl http://localhost:8080/metainfo -X POST --data-binary @- results in: 08ada5a7a6183aae1e09d831df6748d566095a10

So only the infohash is passed, resulting in all of the disadvantages I stated above about DHT and the missing trackers. It's about speed, convenience, fail-proofing (in case some low-peer torrents don't have any DHT-enabled peers), and the possibility of seamless automation and flexibility, which is essential for a tool like this that is supposed to bridge torrents with software and libraries that are not torrent-enabled.

anacrolix commented 2 years ago

That is the infohash passed to the /info endpoint, so that confluence returns the info for the correct torrent. The /info endpoint lets you either set or query the info for a torrent. It is distinct from the metainfo, see https://github.com/anacrolix/confluence#routes.

sgmihai commented 2 years ago

Not sure what to take from your last comment. That's what I said, no? That only the infohash is passed to /info, not the trackers and other torrent metadata.

anacrolix commented 2 years ago

I think the issue here is that the info doesn't contain, and will never contain, trackers or other metadata. That is what the metainfo is for (which incidentally also contains the info). See the sections on metainfo files and the info dictionary in BEP 3: https://www.bittorrent.org/beps/bep_0003.html.
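
To make the distinction concrete, here is a small Python sketch based on BEP 3, not on confluence's code. The metainfo is a bencoded dictionary that carries the tracker URL under "announce" and the info dictionary under "info". The infohash is the SHA-1 of the bencoded info dictionary alone. The torrent contents and tracker hostnames below are made up for illustration, and the encoder is a minimal one written for this example.

```python
import hashlib


def bencode(value) -> bytes:
    """Minimal bencoder for ints, strings, lists, and dicts (BEP 3)."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode()
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Keys are byte strings and must be emitted in sorted order.
        items = [(k.encode() if isinstance(k, str) else k, v) for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def infohash(metainfo: dict) -> str:
    """SHA-1 of the bencoded info dictionary, as hex."""
    return hashlib.sha1(bencode(metainfo["info"])).hexdigest()


# Hypothetical single-file torrent; the piece hash is a placeholder.
info = {"name": "example.bin", "length": 5, "piece length": 16384, "pieces": b"\x00" * 20}

# Two metainfo files for the same content, announcing to different trackers.
metainfo_a = {"announce": "http://tracker-a.example/announce", "info": info}
metainfo_b = {"announce": "http://tracker-b.example/announce", "info": info}

# The metainfo bytes differ, but the infohash is the same for both.
print(bencode(metainfo_a) != bencode(metainfo_b))
print(infohash(metainfo_a) == infohash(metainfo_b))
```

The infohash therefore identifies the torrent but says nothing about trackers, which live only in the metainfo. Note that a real client hashes the info dictionary bytes exactly as they appear in the .torrent file. Re-encoding a parsed dictionary, as this sketch does, gives the same result only when the file is canonically encoded.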

anacrolix commented 2 years ago

I'm going to merge https://github.com/anacrolix/confluence/tree/metainfo-post-no-query and close this. Let me know if I have anything wrong!