folz / bento

:bento: A fast, correct, pure-Elixir library for reading and writing Bencoded metainfo (.torrent) files.
Mozilla Public License 2.0
95 stars 14 forks source link

Update Bento.MetainfoError.transform to adapt to malformed torrent #2

Closed folz closed 1 year ago

folz commented 8 years ago

Hey @ericls, mind dropping a comment explaining the two changes you made? Happy to merge your changes into Bento, but was curious about the to_existing_atom -> to_atom change especially since you already define the atoms in the function, and the motivation behind the change in the first place. If there's a bug I missed, I'd love to hear about it!

ericls commented 8 years ago

The motivation of this is that I've found some torrents may have additional fields names, but they are still "usable" torrents.

Changing from to_existing_atom to to_atom is actually bad. Dynamically generating atoms is not a good practice, since atoms are not garbage collected and there's a limit on how much atoms you can have.

So I made another change to find the values of the fields represented by the pre-defined atoms from the map. This change also solves the problem where some torrents may have field names that are not even "atomizable". According to: http://elixir-lang.org/docs/stable/elixir/String.html#to_atom/1

Currently Elixir does not support the conversion of strings that contain Unicode codepoints greater than 0xFF.

(Actually, I found that most if not all codepoints greater than 128 are not "atomizable")

I have to admit that I'm pretty new to elixir and in general new to functional programming. Please point it out if there's something wrong with my code and comment.

folz commented 8 years ago

Ah, makes sense. Could you create a test case that fails on the current version of transform() but passes with your changes? Once you have that, push to your master and I can get it merged in.

mogeko commented 1 year ago

Ah, makes sense. Could you create a test case that fails on the current version of transform() but passes with your changes? Once you have that, push to your master and I can get it merged in.

I found one.

If we try to use Bento.torrent! Go to encode huck_finn_librivox_archive.torrent and we will get an error.

iex(1)> File.read!("test/_data/huck_finn_librivox_archive.torrent") |> Bento.torrent!
** (ArgumentError) errors were found at the given arguments:

  * 1st argument: not an already existing atom

    :erlang.binary_to_existing_atom("collections", :utf8)
    (bento 1.0.0-dev) lib/bento/metainfo.ex:87: anonymous fn/1 in Bento.Metainfo.transform/1
    (elixir 1.14.3) lib/enum.ex:1658: Enum."-map/2-lists^map/1-0-"/2
    (bento 1.0.0-dev) lib/bento/metainfo.ex:67: Bento.Metainfo.info/1
    (bento 1.0.0-dev) lib/bento/metainfo.ex:80: Bento.Metainfo.info!/1
    (bento 1.0.0-dev) lib/bento.ex:105: Bento.torrent!/1
    iex:2: (file)

Because there is a field "collections" in this torent file, it is out of our %Bento.Metainfo.MultiFile{}. Of course, BEAM could not find :collections in memory, which caused an error.

But it works for @ericls 's version (it skipped any field out of %Bento.Metainfo.MultiFile{}):

iex(1)> File.read!("test/_data/huck_finn_librivox_archive.torrent") |> Bento.torrent!
%Bento.Metainfo.Torrent{
  info: %Bento.Metainfo.MultiFile{
    "piece length": 1048576,
    pieces: <<245, 194, 107, 181, 217, 135, 191, 175, 128, 28, 49, 218, 242, 60,
      247, 168, 52, 83, 77, 117, 197, 98, 168, 210, 236, 252, 53, 170, 157, 63,
      183, 127, 167, 74, 217, 199, 32, 189, 231, 236, 27, 183, 167, 20, 42, 243,
      112, ...>>,
    private: nil,
    name: "huck_finn_librivox",
    files: [...]
  },
  announce: "http://bt1.archive.org:6969/announce",
  "announce-list": [
    ["http://bt1.archive.org:6969/announce"],
    ["http://bt2.archive.org:6969/announce"]
  ],
  "creation date": 1451670299,
  comment: "This content hosted at the Internet Archive at https://archive.org/details/huck_finn_librivox\nFiles may have changed, which prevents torrents from downloading correctly or completely; please check for an updated torrent at https://archive.org/download/huck_finn_librivox/huck_finn_librivox_archive.torrent\nNote: retrieval usually requires a client that supports webseeding (GetRight style).\nNote: many Internet Archive torrents contain a 'pad file' directory. This directory and the files within it may be erased once retrieval completes.\nNote: the file huck_finn_librivox_meta.xml contains metadata about this torrent's contents.",
  "created by": "ia_make_torrent",
  encoding: nil
}
mogeko commented 1 year ago

I believe we can use Bento.Decoder.transform/2 instead of transform/1. #24