sergiotapia / magnetissimo

Web application that indexes all popular torrent sites and saves the results to a local database.
MIT License

Leetx crawler crash on long URL #100

Closed: skwerlman closed this 6 years ago

skwerlman commented 6 years ago

Given a very long URL (longer than 255 chars), the Leetx crawler crashes with:

[error] GenServer Magnetissimo.Crawler.Leetx terminating
** (Postgrex.Error) ERROR 22001 (string_data_right_truncation): value too long for type character varying(255)
    (ecto) lib/ecto/adapters/sql.ex:554: Ecto.Adapters.SQL.struct/8
    (ecto) lib/ecto/repo/schema.ex:547: Ecto.Repo.Schema.apply/4
    (ecto) lib/ecto/repo/schema.ex:213: anonymous fn/14 in Ecto.Repo.Schema.do_insert/4
    (magnetissimo) lib/magnetissimo/contents.ex:9: Magnetissimo.Contents.save_torrent/1
    (magnetissimo) lib/magnetissimo/crawler/leetx.ex:83: Magnetissimo.Crawler.Leetx.process/2
    (magnetissimo) lib/magnetissimo/crawler/leetx.ex:50: Magnetissimo.Crawler.Leetx.handle_info/2
    (stdlib) gen_server.erl:616: :gen_server.try_dispatch/4
    (stdlib) gen_server.erl:686: :gen_server.handle_msg/6
    (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3
Last message: :work

I don't know if there's a clean way of solving this other than changing the URL column to :text. Leetx won't accept partial or missing slugs, even though it also has unique ids for torrents, so the URL can't simply be shortened.

tchoutri commented 6 years ago

Yeah, we totally have to change the type. :) (In a migration, :string means varchar(255) on PostgreSQL.)

skwerlman commented 6 years ago

It looks like we'll also have to change :name and :magnet to :text, since these are also unlimited length. (And actually, the only migration we have already treats these as :text.) Haha, I'm still new to Ecto, and apparently :string will map to :text automatically.

sergiotapia commented 6 years ago

@skwerlman Let's go ahead and use text for all of the "string" fields. Postgres will use the appropriate structure on the backend. No need for us to use string vs. text. Go all in with text.

skwerlman commented 6 years ago

PR updated to change all string fields to text
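
For readers who want to see the shape of such a change: the PR itself is not shown in this thread, so what follows is only a minimal sketch of an Ecto migration that widens string columns to text. The module name, the `torrents` table name, and the `url` column name are assumptions made for illustration; only the :name and :magnet fields and "the URL column" are mentioned above.

```elixir
# Hypothetical migration; module, table, and column names are assumed,
# not taken from the actual PR.
defmodule Magnetissimo.Repo.Migrations.ChangeStringFieldsToText do
  use Ecto.Migration

  def up do
    alter table(:torrents) do
      # :text has no length limit on PostgreSQL, unlike the
      # varchar(255) that :string produces in a migration.
      modify :url, :text
      modify :name, :text
      modify :magnet, :text
    end
  end

  def down do
    alter table(:torrents) do
      # Reverting to varchar(255) will likely fail if any stored
      # value is already longer than 255 characters.
      modify :url, :string
      modify :name, :string
      modify :magnet, :string
    end
  end
end
```

The schema fields can stay declared as :string; the change only affects the column type created by the migration. Explicit up/down functions are used here because a bare `modify` inside `change` is not automatically reversible.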