ttlajus / lava_torrent

lava_torrent is a library for parsing/encoding/creating bencode and .torrent files.
Apache License 2.0

Model for other torrent related data structure #2

Closed: msimonin closed 4 years ago

msimonin commented 4 years ago

Hi @ttlajus

I'm wondering if there's any plan to model other data structures of the torrent protocol, like tracker responses [1] or scrape information? From your point of view, would you like to have them in your lib?

ttlajus commented 4 years ago

@msimonin Thanks for asking.

In the case of tracker responses, if I understand it correctly, you can just send HTTP requests and parse the returned bencode, which can be done using the library in its current form (from_bytes()).

That said, something like TrackerResponse can provide extra convenience, so I don't mind adding it. Anything else you'd like to see?

The library would only handle parsing for now though. If you want the library to also do the network request/response for you, then we would be moving towards an actual torrenting library. While I'm happy to develop it in that direction, it's difficult for me to make the time commitment at the moment : )

msimonin commented 4 years ago

Thanks @ttlajus for your answer.

I was mainly thinking of adding some convenient new structs (like TrackerResponse). Thanks for pointing me to from_bytes(); I'll give it a try as a start.

ttlajus commented 4 years ago

@msimonin I pointed out from_bytes() simply to suggest that it's doable without a dedicated struct, but I'll add TrackerResponse just for the sake of user convenience.

ttlajus commented 4 years ago

@msimonin After attempting to add the support, I've decided to reverse my decision (at least for now), but I'll leave this issue open until there's a conclusive answer. Apologies.

My library can only handle tracker responses (/announce) but not tracker scrape responses (/scrape) at the moment. The reason is that the info hash strings in the scrape responses are actually raw bytes (BEP 48), but my library assumes that bencode strings are UTF-8 encoded.
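
To illustrate the mismatch described above: a 20-byte info hash is arbitrary binary data, so it usually fails UTF-8 validation. The sketch below uses only the Rust standard library. The helper names `is_valid_utf8` and `to_hex` are made up for this example and are not part of lava_torrent.

```rust
/// Returns true when `bytes` could be stored in a `String` without loss.
/// (Hypothetical helper, not a lava_torrent function.)
fn is_valid_utf8(bytes: &[u8]) -> bool {
    std::str::from_utf8(bytes).is_ok()
}

/// Hex-encodes raw bytes, one common way to make an info hash printable.
/// (Hypothetical helper, not a lava_torrent function.)
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}
```

An ordinary key such as `b"announce"` passes the check, while a hash containing a byte like `0xff` does not, which is why a `String`-keyed dictionary cannot hold it.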

Basically, to support this, BencodeElem::Dictionary would have to be HashMap<Vec<u8>, BencodeElem> instead of HashMap<String, BencodeElem>. Fixing this is certainly possible, but it's an annoying task that I'd rather not take on right now.
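
A minimal sketch of what a byte-keyed dictionary could look like, assuming a stripped-down decoder written from scratch. The `Elem` enum and the `parse`/`parse_bytes` functions are hypothetical and are not lava_torrent's actual types or implementation. The sketch is also lenient: it does not reject non-canonical integers or unsorted keys.

```rust
use std::collections::HashMap;

/// A bencode value whose strings and dictionary keys are raw bytes.
/// (Hypothetical type for illustration only.)
#[derive(Debug, PartialEq)]
enum Elem {
    Bytes(Vec<u8>),
    Int(i64),
    List(Vec<Elem>),
    Dict(HashMap<Vec<u8>, Elem>),
}

/// Parses one element starting at `pos`; returns it with the next position.
fn parse(input: &[u8], pos: usize) -> Option<(Elem, usize)> {
    match *input.get(pos)? {
        b'i' => {
            // Integer: `i<digits>e`.
            let end = pos + 1 + input[pos + 1..].iter().position(|&b| b == b'e')?;
            let n = std::str::from_utf8(&input[pos + 1..end]).ok()?.parse().ok()?;
            Some((Elem::Int(n), end + 1))
        }
        b'l' => {
            // List: `l<elements>e`.
            let mut items = Vec::new();
            let mut p = pos + 1;
            while *input.get(p)? != b'e' {
                let (item, next) = parse(input, p)?;
                items.push(item);
                p = next;
            }
            Some((Elem::List(items), p + 1))
        }
        b'd' => {
            // Dictionary: `d<key><value>...e`, keys kept as raw bytes.
            let mut map = HashMap::new();
            let mut p = pos + 1;
            while *input.get(p)? != b'e' {
                let (key, next) = parse_bytes(input, p)?;
                let (value, next) = parse(input, next)?;
                map.insert(key, value);
                p = next;
            }
            Some((Elem::Dict(map), p + 1))
        }
        b'0'..=b'9' => {
            let (bytes, next) = parse_bytes(input, pos)?;
            Some((Elem::Bytes(bytes), next))
        }
        _ => None,
    }
}

/// Parses a `<length>:<bytes>` string without any UTF-8 check.
fn parse_bytes(input: &[u8], pos: usize) -> Option<(Vec<u8>, usize)> {
    let colon = pos + input.get(pos..)?.iter().position(|&b| b == b':')?;
    let len: usize = std::str::from_utf8(&input[pos..colon]).ok()?.parse().ok()?;
    let start = colon + 1;
    let end = start.checked_add(len)?;
    Some((input.get(start..end)?.to_vec(), end))
}
```

With keys stored as `Vec<u8>`, a scrape response can be looked up by the raw 20-byte info hash (`dict.get(&hash[..])`), and textual keys still work through byte literals such as `b"files"`.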

With scrape response support off the table, I think few people will find tracker/announce response support alone to be useful. If you are interested in that, I can post my code here, and it should be plug and play.

Rant: Why did they have to use info hashes as keys? Why couldn't they use UTF-8 strings? Was saving 20 bytes per info hash that important in 2016? Ugh.


EDIT 1: I should clarify that the support for tracker/announce responses can only be partial, as peer ids are raw bytes too. Fortunately, the compact peer lists (BEP 23) don't contain peer ids, so my library can parse them just fine.
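
For reference, a compact peer list (BEP 23) packs each IPv4 peer into 6 bytes: 4 bytes of address followed by a 2-byte port in network (big-endian) byte order, with no peer id. A minimal decoding sketch using only the standard library follows. The function name `parse_compact_peers` is an assumption for this example, not the library's API.

```rust
use std::net::{Ipv4Addr, SocketAddrV4};

/// Decodes a BEP 23 compact peer string into socket addresses.
/// Returns `None` if the length is not a multiple of 6.
/// (Hypothetical helper, not a lava_torrent function.)
fn parse_compact_peers(bytes: &[u8]) -> Option<Vec<SocketAddrV4>> {
    if bytes.len() % 6 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(6)
            .map(|c| {
                // First 4 bytes: IPv4 address; last 2 bytes: big-endian port.
                let ip = Ipv4Addr::new(c[0], c[1], c[2], c[3]);
                let port = u16::from_be_bytes([c[4], c[5]]);
                SocketAddrV4::new(ip, port)
            })
            .collect(),
    )
}
```

Because the decoder only ever reads the value as raw bytes, nothing here depends on the peer string being valid UTF-8 once it has been extracted from the response.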

ttlajus commented 4 years ago

Hey @msimonin

I've finally had a chance to deal with this. Sorry for the delay!

It ended up being simpler than I thought. However, I've only tested my code with the Ubuntu tracker, as many trackers only support UDP access. If you are still interested, do you think you could help test it on more trackers?

If you are willing to help, please use lava_torrent = { git = "https://github.com/ttlajus/lava_torrent" } in your Cargo.toml file, as I haven't released the changes yet.

ttlajus commented 4 years ago

I just tested with 2 more trackers, and my code seemed to work. Since OP isn't responding and tracker stuff is considered experimental, I went ahead and published v0.5.