lau / tzdata

tzdata for Elixir. Born from the Calendar library.
MIT License

Would it be possible to remove the dependency on hackney? #62

Closed mindreframer closed 6 years ago

mindreframer commented 6 years ago

First: thanks for maintaining tzdata! Background story:

I was playing today with SchedEx (Empex NYC 2018 talk + GitHub) and noticed that tzdata pulls in quite a few transitive dependencies.

$ mix app.tree
schedextest
├── elixir
├── logger
│   └── elixir
├── cortex
│   ├── elixir
│   ├── logger
│   └── file_system
│       ├── elixir
│       └── logger
└── sched_ex
    ├── elixir
    ├── crontab
    │   ├── elixir
    │   └── logger
    ├── logger
    └── timex
        ├── elixir
        ├── logger
        ├── tzdata
        │   ├── elixir
        │   ├── hackney
        │   │   ├── crypto
        │   │   ├── asn1
        │   │   ├── public_key
        │   │   │   ├── asn1
        │   │   │   └── crypto
        │   │   ├── ssl
        │   │   │   ├── crypto
        │   │   │   └── public_key
        │   │   ├── idna
        │   │   │   └── unicode_util_compat
        │   │   ├── mimerl
        │   │   ├── certifi
        │   │   ├── ssl_verify_fun
        │   │   │   └── ssl
        │   │   └── metrics
        │   └── logger
        ├── gettext
        │   ├── elixir
        │   └── logger
        └── combine
            └── elixir

The only place where hackney is needed is this module: https://github.com/lau/tzdata/blob/06da3182cc948902bc42a3cec86552744382c229/lib/tzdata/data_loader.ex, which makes GET and HEAD requests. Nothing fancy, and not performance-critical either, because it only has to download a file of roughly 300 KB once in a while.

One option would be to use something like https://github.com/alexandrubagu/simplehttp; another would be to include (embed / vendor in) a small module that wraps :httpc.request. That way tzdata would be independent of any third-party HTTP client by relying on the always-present :httpc implementation.

I understand that this does not look like a high-priority issue, yet considering the popularity of timex and other libraries depending on tzdata, it might be sensible to shed some of the dependencies in the long term to make including tzdata as painless as possible.

Here are some numbers to make it more visual:

https://hex.pm/packages?search=depends%3Acalendar -> 29 packages depend on calendar, guardian and quantum being the most popular.

https://hex.pm/packages?search=depends%3Atimex -> 209 packages depend on timex, with a lot of popular packages among them...

Hope this will get proper consideration and have a nice day!

lau commented 6 years ago

Hi. I would like to get rid of the hackney dependency if there was an alternative that could do the same job.

Erlang has a built-in HTTP client: HTTPC. The reason to use hackney is security. HTTPC does not validate HTTPS requests.

One alternative I have considered is to maintain a service that signs tzdata releases and to include a key with the Elixir Tzdata package, so that it is possible to download over HTTP (as opposed to HTTPS) and still be able to validate the data. Users of Elixir Tzdata would then have to trust that service instead of IANA.
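
The signed-release idea could look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the module name `SignedTzdata`, the choice of an RSA key with SHA-256, and the signature arriving as a separate binary are hypothetical and not part of tzdata. The only OTP calls used are `:public_key.generate_key/1`, `:public_key.sign/3` and `:public_key.verify/4`.

```elixir
defmodule SignedTzdata do
  @moduledoc """
  Hypothetical helper: accept downloaded release data only if its
  signature matches a public key shipped with the package.
  """

  # `public_key` is an `{:RSAPublicKey, modulus, exponent}` record tuple.
  def verify(data, signature, public_key) when is_binary(data) and is_binary(signature) do
    if :public_key.verify(data, :sha256, signature, public_key) do
      {:ok, data}
    else
      {:error, :bad_signature}
    end
  end
end

# Demo with a throwaway key pair. A real signing service would keep the
# private key to itself and publish only the public key.
private_key = :public_key.generate_key({:rsa, 2048, 65_537})
# Elements 2 and 3 of the RSAPrivateKey record are modulus and public exponent.
public_key = {:RSAPublicKey, elem(private_key, 2), elem(private_key, 3)}

release = "pretend this is tzdata-latest.tar.gz"
signature = :public_key.sign(release, :sha256, private_key)

IO.inspect(SignedTzdata.verify(release, signature, public_key))
IO.inspect(SignedTzdata.verify(release <> "tampered", signature, public_key))
```

With a check like this, the transport no longer has to be trusted, but as noted above the signing service does.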

mindreframer commented 6 years ago

@lau OK, I understand. Thanks for the quick response!

Maintaining your own service seems like a lot of hassle, though... It would be great to have other suggestions from the community; maybe there is an alternative that does not involve many dependencies.

Please keep this issue open for some time, we should collect ideas here. I will spend some time brainstorming on this.

lau commented 6 years ago

I'm closing it for now, but feel free to comment here and it can be reopened. Or open a new issue.

mindreframer commented 6 years ago

@lau I agree, there are currently no immediately actionable steps. Thanks for the clarification and discussion.

mindreframer commented 5 years ago

@lau I found an article that describes how to enable certificate checking with :httpc. https://blog.voltone.net/post/7

iex(4)> cacertfile = :code.priv_dir(:http_clients) ++ '/cacert.pem'
...
iex(5)> :httpc.set_options(socket_opts: [verify: :verify_peer, cacertfile: cacertfile])
:ok
iex(6)> :httpc.request('https://blog.voltone.net')
{:ok, {{'HTTP/1.1', 200, 'OK'}, ...}}
iex(7)> :httpc.request('https://selfsigned.voltone.net')
{:error,
 {:failed_connect,
  [{:to_address, {'selfsigned.voltone.net', 443}},
   {:inet,
    [:inet, {:verify, :verify_peer},
     {:cacertfile,
      '[...]/http_clients/_build/dev/lib/http_clients/priv/cacert.pem'}],
    {:tls_alert, 'bad certificate'}}]}}

The trust store can be taken from certifi: https://github.com/certifi/erlang-certifi/tree/master/priv , as :hackney already does.

Wdyt? Just found it and wanted to share.
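
A per-request variant of the same idea, as a sketch only. Assumptions: the module name `VerifiedHttpc` and its two functions are made up for illustration; the CA bundle path is supplied by the caller (for example the `cacerts.pem` file in certifi's `priv` directory); and the `:inets` and `:ssl` applications are available. As far as I know, newer OTP releases expect TLS options under the `ssl` key of the HTTP options rather than in `socket_opts`, and wildcard certificates usually need the HTTPS hostname match function from `:public_key`.

```elixir
defmodule VerifiedHttpc do
  # Hypothetical helper, not part of tzdata or OTP.

  # TLS options that ask :ssl to verify the peer against the given CA bundle.
  def ssl_opts(cacertfile) do
    [
      verify: :verify_peer,
      cacertfile: to_charlist(cacertfile),
      # Apply HTTPS hostname matching rules (e.g. wildcard certificates).
      customize_hostname_check: [
        match_fun: :public_key.pkix_verify_hostname_match_fun(:https)
      ]
    ]
  end

  # GET with certificate verification; returns the raw :httpc result.
  def get(url, cacertfile) do
    {:ok, _} = Application.ensure_all_started(:inets)
    {:ok, _} = Application.ensure_all_started(:ssl)

    :httpc.request(
      :get,
      {to_charlist(url), []},
      [ssl: ssl_opts(cacertfile)],
      body_format: :binary
    )
  end
end
```

Passing the options per request avoids changing the global `:httpc` profile, which other code in the same VM may also be using.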

mindreframer commented 5 years ago

Here is a working implementation that actually checks SSL certificate validity without :hackney:

defmodule Tzdata.Http do
  @type header :: {String.t(), String.t()}
  @type httpresponse ::
          {:ok, integer, String.t(), [header]} | {:error, String.t()} | {:error, term}
  @callback get(String.t()) :: httpresponse
  @callback head(String.t()) :: httpresponse
end

defmodule Tzdata.HTTPCAdapter do
  @behaviour Tzdata.Http

  @impl Tzdata.Http
  def get(url) do
    setup()
    with {:ok, {{_, status_code, _}, headers_as_charlists, body_as_charlist}} <-
           :httpc.request(url |> convert_to_list) do
      {:ok, status_code, convert_body(body_as_charlist), convert_headers(headers_as_charlists)}
    end
  end

  @impl Tzdata.Http
  def head(url) do
    setup()
    with {:ok, {{_, status_code, _}, headers_as_charlists, body_as_charlist}} <-
        :httpc.request(:head, {url |> convert_to_list, []}, [], []) do
      {:ok, status_code, convert_body(body_as_charlist), convert_headers(headers_as_charlists)}
    end
  end

  defp convert_headers(headers) when is_list(headers) do
    headers
    |> Enum.map(fn {key, value} ->
      {
        key |> convert_header_key(),
        value |> convert_to_string()
      }
    end)
  end

  defp setup do
    :httpc.set_options(socket_opts: [verify: :verify_peer, cacertfile: certfile()])
  end

  defp certfile do
    :code.priv_dir(:certifi) ++ '/cacerts.pem'
  end

  defp convert_header_key(key) do
    key
    |> List.to_string()
    |> String.split("-")
    |> Enum.map(&Macro.camelize(&1))
    |> Enum.join("-")
  end

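  # Note: List.to_string/1 UTF-8-encodes list elements, so a non-text body
  # (such as the gzip archive) is altered for bytes above 127; compare the
  # two get/1 results in the output below.
  defp convert_body(body) do
    body |> convert_to_string()
  end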

  defp convert_to_string(str) when is_binary(str), do: str
  defp convert_to_string(list) when is_list(list), do: list |> List.to_string()

  defp convert_to_list(str) when is_binary(str), do: str |> String.to_charlist()
  defp convert_to_list(list) when is_list(list), do: list
end

defmodule Tzdata.HackneyAdapter do
  @behaviour Tzdata.Http

  @impl Tzdata.Http
  def get(url) do
    with {:ok, 200, headers, client_ref} <- :hackney.get(url, [], "", follow_redirect: true),
         {:ok, body} <- :hackney.body(client_ref) do
      {:ok, 200, body, headers}
    end
  end

  @impl Tzdata.Http
  def head(url) do
    :hackney.head(url, [], "", [])
  end
end

defmodule Runner do
  def run_hackney do
    run(Tzdata.HackneyAdapter)
  end

  def run_httpc do
    run(Tzdata.HTTPCAdapter)
  end

  def run(adapter_module) do
    adapter_module.get("https://data.iana.org/time-zones/tzdata-latest.tar.gz") |> IO.inspect()
    adapter_module.head("https://data.iana.org/time-zones/tzdata-latest.tar.gz") |> IO.inspect()
  end
end

Running it:

iex(1)> Tzdata.HackneyAdapter.head("https://data.iana.org/time-zones/tzdata-latest.tar.gz")
{:ok, 200,
 [
   {"Accept-Ranges", "bytes"},
   {"Cache-Control", "max-age=86400"},
   {"Content-Type", "application/x-gzip"},
   {"Date", "Wed, 12 Dec 2018 10:45:23 GMT"},
   {"Etag", "\"59748-57934b7bc096a\""},
   {"Expires", "Thu, 13 Dec 2018 10:45:23 GMT"},
   {"Last-Modified", "Sat, 27 Oct 2018 12:10:11 GMT"},
   {"Referrer-Policy", "origin-when-cross-origin"},
   {"Server", "ECAcc (mic/9B10)"},
   {"Strict-Transport-Security", "max-age=48211200; preload"},
   {"X-Cache", "HIT"},
   {"X-Frame-Options", "SAMEORIGIN"},
   {"Content-Length", "366408"}
 ]}
iex(2)> Tzdata.HTTPCAdapter.head("https://data.iana.org/time-zones/tzdata-latest.tar.gz")
{:ok, 200,
 [
   {"Cache-Control", "max-age=86400"},
   {"Date", "Wed, 12 Dec 2018 10:45:27 GMT"},
   {"Accept-Ranges", "bytes"},
   {"Etag", "\"59748-57934b7bc096a\""},
   {"Server", "ECAcc (mic/9B10)"},
   {"Content-Length", "366408"},
   {"Content-Type", "application/x-gzip"},
   {"Expires", "Thu, 13 Dec 2018 10:45:27 GMT"},
   {"Last-Modified", "Sat, 27 Oct 2018 12:10:11 GMT"},
   {"Referrer-Policy", "origin-when-cross-origin"},
   {"Strict-Transport-Security", "max-age=48211200; preload"},
   {"X-Cache", "HIT"},
   {"X-Frame-Options", "SAMEORIGIN"}
 ]}
iex(3)> Tzdata.HackneyAdapter.get("https://data.iana.org/time-zones/tzdata-latest.tar.gz")
{:ok, 200,
 <<31, 139, 8, 0, 0, 0, 0, 0, 2, 3, 236, 91, 203, 114, 227, 70, 150, 173, 109,
   225, 43, 114, 228, 137, 144, 228, 224, 3, 32, 41, 82, 162, 219, 238, 214,
   187, 202, 86, 149, 20, 69, 201, 158, 174, 137, 137, 138, ...>>,
 [
   {"Accept-Ranges", "bytes"},
   {"Cache-Control", "max-age=86400"},
   {"Content-Type", "application/x-gzip"},
   {"Date", "Wed, 12 Dec 2018 10:47:05 GMT"},
   {"Etag", "\"59748-57934b7bc096a\""},
   {"Expires", "Thu, 13 Dec 2018 10:47:05 GMT"},
   {"Last-Modified", "Sat, 27 Oct 2018 12:10:11 GMT"},
   {"Referrer-Policy", "origin-when-cross-origin"},
   {"Server", "ECAcc (mic/9B10)"},
   {"Strict-Transport-Security", "max-age=48211200; preload"},
   {"X-Cache", "HIT"},
   {"X-Frame-Options", "SAMEORIGIN"},
   {"Content-Length", "366408"}
 ]}
iex(4)> Tzdata.HTTPCAdapter.get("https://data.iana.org/time-zones/tzdata-latest.tar.gz")
{:ok, 200,
 <<31, 194, 139, 8, 0, 0, 0, 0, 0, 2, 3, 195, 172, 91, 195, 139, 114, 195, 163,
   70, 194, 150, 194, 173, 109, 195, 161, 43, 114, 195, 164, 194, 137, 194, 144,
   195, 164, 195, 160, 3, 32, 41, 82, 194, 162, 195, 155, ...>>,
 [
   {"Cache-Control", "max-age=86400"},
   {"Date", "Wed, 12 Dec 2018 10:47:18 GMT"},
   {"Accept-Ranges", "bytes"},
   {"Etag", "\"59748-57934b7bc096a\""},
   {"Server", "ECAcc (mic/9B10)"},
   {"Content-Length", "366408"},
   {"Content-Type", "application/x-gzip"},
   {"Expires", "Thu, 13 Dec 2018 10:47:18 GMT"},
   {"Last-Modified", "Sat, 27 Oct 2018 12:10:11 GMT"},
   {"Referrer-Policy", "origin-when-cross-origin"},
   {"Strict-Transport-Security", "max-age=48211200; preload"},
   {"X-Cache", "HIT"},
   {"X-Frame-Options", "SAMEORIGIN"}
 ]}
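
Comparing the two `get` results above, the body bytes differ: the hackney adapter returns `<<31, 139, 8, ...>>`, while the httpc adapter returns `<<31, 194, 139, 8, ...>>`. That pattern is what `List.to_string/1` produces when it treats each list element above 127 as a Unicode code point and UTF-8-encodes it, so the archive returned by the httpc adapter would most likely not decompress. The snippet below is a small demonstration of the effect on the first three bytes; the fix shown in the trailing comment (asking `:httpc` for a binary body via `body_format: :binary`) is one possible approach and is left as a comment because it needs network access.

```elixir
# First three bytes of a gzip stream, as a charlist body from :httpc would carry them.
bytes = [31, 139, 8]

# List.to_string/1 interprets integers as code points and emits UTF-8.
as_text = List.to_string(bytes)

# :erlang.list_to_binary/1 keeps every byte as-is.
as_binary = :erlang.list_to_binary(bytes)

IO.inspect(as_text, binaries: :as_binaries)
# prints <<31, 194, 139, 8>>
IO.inspect(as_binary, binaries: :as_binaries)
# prints <<31, 139, 8>>

# One way to avoid the conversion altogether (not run here):
#   :httpc.request(:get, {String.to_charlist(url), []}, [], body_format: :binary)
```

Headers are plain ASCII text in practice, so converting them with `List.to_string/1` is unaffected; only the body needs the byte-preserving path.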