elixir-cldr / cldr

Elixir implementation of CLDR/ICU

JSON Cache Parsing Issues sometimes #137

Closed halostatue closed 4 years ago

halostatue commented 4 years ago

I ran an upgrade of ex_cldr and some related libraries yesterday, and on my CI solution the build fails when JSON-parsing a cache file:

Generating LowesLoyalty.Cldr for 6 locales named ["en", "en-001", "en-CA", "fr", "fr-CA", ...] with a default locale named "en-CA"

== Compilation error in file lib/lowes_loyalty/cldr.ex ==
** (Jason.DecodeError) unexpected byte at position 357830: 0xE2
    lib/jason.ex:78: Jason.decode!/2
    (ex_cldr) lib/cldr/config/config.ex:1277: Cldr.Config.do_get_locale/3
    (ex_cldr) lib/cldr/compiler_locale_cache.ex:71: Cldr.Locale.Cache.do_get_locale/2
    (ex_cldr) lib/cldr/config/config.ex:1866: Cldr.Config.decimal_formats_for/2
    (elixir) lib/enum.ex:1327: Enum."-map/2-lists^map/1-0-"/2
    (elixir) lib/enum.ex:1327: Enum."-map/2-lists^map/1-0-"/2
    (ex_cldr) lib/cldr/config/config.ex:1855: Cldr.Config.decimal_format_list/1
    lib/cldr/number/formatter/decimal_formatter.ex:778: Cldr.Number.Formatter.Decimal.define_to_string/1

This does not happen locally on my computer, but as soon as I rolled back, everything built just fine.

Here’s my CLDR module:

defmodule LowesLoyalty.Cldr do
  @moduledoc false

  use Cldr,
    default_locale: "en-CA",
    locales: ["en-CA", "fr-CA", "en", "fr"],
    precompile_number_formats: ["¤¤#,##0.##"],
    providers: [Cldr.Calendar, Cldr.DateTime, Cldr.List, Cldr.Number]
end

And this is what I upgraded yesterday:

As far as I can tell, the failure was consistently on the fr.json cache file—but the failure was consistent. I’m not sure why this was working on my Mac but not on our CI (using Ubuntu 18.04 etc.).

I won’t have time to deal with testing individual upgrades for a few days (big deadline coming up) to try to bisect the failure (especially since I cannot reproduce it locally, only on CI), but my gut feeling is that it’s something in 2.14.0 or 2.14.1 if it’s in ex_cldr proper.

kipcole9 commented 4 years ago

Thanks very much for the report, it's appreciated, although I'm sorry this is happening. I suspect this is going to be a challenge to track down, as you suggest. My test suite reads and decodes all the JSON files, but then my tests don't run on a wide variety of platforms.

Version 2.14.0 updated the data source to CLDR version 37, so it's conceivable that's why you see the change there. I will of course try to reproduce, and also see if I can reproduce on Ubuntu 18.04.

Good luck with your deadline, I suspect I'll need that time too to bottom this out.

kipcole9 commented 4 years ago

Are you able to confirm the version of Jason you have installed in both dev and CI? Thanks to the stack trace you provided above, I know where in the JSON string the error is happening. <<226>> (0xE2) is the first byte of several multibyte characters in UTF-8.
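To make the point about byte 226 (0xE2) concrete, here is a minimal sketch. The module name and the sample characters are illustrative assumptions, not taken from ex_cldr or from the locale files. It shows that some common punctuation characters begin with 0xE2 in UTF-8, and that the byte on its own is not valid UTF-8, which is roughly what a decoder would see if a multibyte character were cut short.

```elixir
defmodule Utf8Sketch do
  # Hypothetical helper: returns the first byte of a non-empty binary.
  def lead_byte(<<first, _rest::binary>>), do: first
end

# Sample characters chosen for illustration only.
for char <- ["’", "…", "€"] do
  IO.puts("#{char} starts with byte #{Utf8Sketch.lead_byte(char)} and is #{byte_size(char)} bytes long")
end

# A lone lead byte is an incomplete UTF-8 sequence; the full three bytes are valid.
IO.inspect(String.valid?(<<0xE2>>), label: "lone 0xE2 valid?")
IO.inspect(String.valid?(<<0xE2, 0x80, 0x99>>), label: "0xE2 0x80 0x99 valid?")
```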

I wonder if there is an encoding required to be set on the Ubuntu side to ensure the file is read as UTF8? That could at least account for why you see a difference on your dev machine versus CI. This is not an area of strength for me so I'll need to do some googling.

halostatue commented 4 years ago

The specific error has changed over builds. Here’s an earlier failure:

== Compilation error in file lib/lowes_loyalty/cldr.ex ==
** (Jason.DecodeError) unexpected end of input at position 364686
    lib/jason.ex:78: Jason.decode!/2
    (ex_cldr) lib/cldr/config/config.ex:1277: Cldr.Config.do_get_locale/3
    (ex_cldr) lib/cldr/compiler_locale_cache.ex:71: Cldr.Locale.Cache.do_get_locale/2
    (ex_cldr) lib/cldr/config/config.ex:1866: Cldr.Config.decimal_formats_for/2
    (elixir) lib/enum.ex:1327: Enum."-map/2-lists^map/1-0-"/2
    (elixir) lib/enum.ex:1327: Enum."-map/2-lists^map/1-0-"/2
    (ex_cldr) lib/cldr/config/config.ex:1855: Cldr.Config.decimal_format_list/1
    lib/cldr/number/formatter/decimal_formatter.ex:778: Cldr.Number.Formatter.Decimal.define_to_string/1
== Compilation error in file lib/lowes_loyalty/cldr.ex ==
** (Jason.DecodeError) unexpected end of input at position 360573
    lib/jason.ex:78: Jason.decode!/2
    (ex_cldr) lib/cldr/config/config.ex:1277: Cldr.Config.do_get_locale/3
    (ex_cldr) lib/cldr/compiler_locale_cache.ex:71: Cldr.Locale.Cache.do_get_locale/2
    (ex_cldr) lib/cldr/config/config.ex:1866: Cldr.Config.decimal_formats_for/2
    (elixir) lib/enum.ex:1327: Enum."-map/2-lists^map/1-0-"/2
    (elixir) lib/enum.ex:1327: Enum."-map/2-lists^map/1-0-"/2
    (ex_cldr) lib/cldr/config/config.ex:1855: Cldr.Config.decimal_format_list/1
    lib/cldr/number/formatter/decimal_formatter.ex:778: Cldr.Number.Formatter.Decimal.define_to_string/1

I can confirm that when the end of input has been hit, even tools like jq fail, because the JSON file is truncated.

kipcole9 commented 4 years ago

That's very weird. I tested in a container with the same Ubuntu release with no issues. I'm just updating my Cirrus CI test matrix to do some more tests.

kipcole9 commented 4 years ago

I built a simple repo at https://github.com/elixir-cldr/cldr_test_ubuntu and I run it in Cirrus CI on Ubuntu, but I can't yet reproduce the problem, which is very annoying. It doesn't fail on a local container and it doesn't fail in Cirrus CI. Definitely something unusual.

I'll create a branch that has some additional debugging code around the locale cache which, if you are kind enough to try when you have a chance, might give us a clue as to the problem.

kipcole9 commented 4 years ago

The hint about jq results might be helpful. Can you compare the downloaded file lengths of the locales with the following on your build and CI servers?

kip@Kips-iMac-Pro locales % ls -al fr.json fr-CA.json en.json en-CA.json
-rw-r--r--  1 kip  staff  361883 27 May 10:44 en-CA.json
-rw-r--r--  1 kip  staff  364516 27 May 10:44 en.json
-rw-r--r--  1 kip  staff  353125 27 May 10:44 fr-CA.json
-rw-r--r--@ 1 kip  staff  365732 27 May 10:44 fr.json

kipcole9 commented 4 years ago

Ahhhhhhhh I wonder! It's possible your CI has cached older versions of the JSON files. The locale files are ex_cldr version dependent. They don't change every release, but twice a year when CLDR releases their data, I incorporate them. And from time to time I do also generate new data that gets included in the JSON files. So if your CI is caching them (and therefore you aren't seeing Downloading locale ..... messages on your CI server) I suspect that is the problem.

It so happens that there has recently been a CLDR data update (incorporated in ex_cldr 2.14.0) and the addition of new data to support date/time interval formatting (incorporated in ex_cldr 2.15.0).

halostatue commented 4 years ago

I did an explicit cache clear, too. That’s when I got the bad character value.

Here’s the local version:

-rw-r--r--   1 austin  staff  345773  5 Jun 10:16 en-001.json
-rw-r--r--   1 austin  staff  358584  5 Jun 10:20 en-CA.json
-rw-r--r--   1 austin  staff  361215  5 Jun 10:16 en.json
-rw-r--r--   1 austin  staff  347068  5 Jun 10:20 fr-CA.json
-rw-r--r--   1 austin  staff  359301  5 Jun 10:20 fr.json
-rw-r--r--   1 austin  staff  305613  5 Jun 10:16 root.json

On CI, just now:

-rw-r--r-- 1 runner runner 349074 Jun  5 21:11 en-001.json
-rw-rw-r-- 1 runner runner 361883 Jun  5 21:13 en-CA.json
-rw-r--r-- 1 runner runner 364516 Jun  5 21:11 en.json
-rw-rw-r-- 1 runner runner 352347 Jun  5 21:13 fr-CA.json
-rw-rw-r-- 1 runner runner 365732 Jun  5 21:13 fr.json
-rw-r--r-- 1 runner runner 308789 Jun  5 21:11 root.json

I got another truncated situation:

== Compilation error in file lib/lowes_loyalty/cldr.ex ==
** (Jason.DecodeError) unexpected end of input at position 352347
    lib/jason.ex:78: Jason.decode!/2
    (ex_cldr) lib/cldr/config/config.ex:1277: Cldr.Config.do_get_locale/3
    (ex_cldr) lib/cldr/compiler_locale_cache.ex:71: Cldr.Locale.Cache.do_get_locale/2
    (ex_cldr) lib/cldr/config/config.ex:1866: Cldr.Config.decimal_formats_for/2
    (elixir) lib/enum.ex:1327: Enum."-map/2-lists^map/1-0-"/2
    (elixir) lib/enum.ex:1327: Enum."-map/2-lists^map/1-0-"/2
    (ex_cldr) lib/cldr/config/config.ex:1855: Cldr.Config.decimal_format_list/1
    lib/cldr/number/formatter/decimal_formatter.ex:778: Cldr.Number.Formatter.Decimal.define_to_string/1

halostatue commented 4 years ago

I am not seeing Downloading…. I’m wondering if there’s a way to force download on builds.

halostatue commented 4 years ago

I just did an explicit rm -rf deps/ex_cldr/priv/cldr/locales/.

I got the same (end of input) error, but different listing:

total 2040
-rw-rw-r-- 1 runner runner 349074 Jun  5 21:17 en-001.json
-rw-rw-r-- 1 runner runner 361883 Jun  5 21:17 en-CA.json
-rw-rw-r-- 1 runner runner 364516 Jun  5 21:17 en.json
-rw-rw-r-- 1 runner runner 352347 Jun  5 21:17 fr-CA.json
-rw-rw-r-- 1 runner runner 348234 Jun  5 21:17 fr.json
-rw-rw-r-- 1 runner runner 298878 Jun  5 21:17 root.json

The only thing that I can see is being cached is the kerl instance I have.

I’m going to be offline tonight, so I can’t really look at this any further right now, but this is very weird.

kipcole9 commented 4 years ago

If you're not seeing "Downloading ...." then they are already found in your _build/test/lib/ex_cldr/priv/cldr/locales directory.

kipcole9 commented 4 years ago

Or wherever you configure your :cldr_data_dir to be.
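For comparing what is actually sitting in that directory between machines, a small sketch like the following may help. `LocaleSizes` is a hypothetical helper, not part of ex_cldr, and the directory is passed in rather than looked up, so point it at whichever location applies to your setup.

```elixir
defmodule LocaleSizes do
  # Hypothetical helper, not part of ex_cldr: returns [{file_name, size_in_bytes}]
  # for every .json file directly inside `dir`, sorted by path.
  def list(dir) do
    dir
    |> Path.join("*.json")
    |> Path.wildcard()
    |> Enum.sort()
    |> Enum.map(fn path -> {Path.basename(path), File.stat!(path).size} end)
  end
end

# Example path taken from the comment above; adjust for your Mix environment.
# A missing directory simply yields an empty list.
"_build/test/lib/ex_cldr/priv/cldr/locales"
|> LocaleSizes.list()
|> Enum.each(fn {name, size} -> IO.puts("#{size}\t#{name}") end)
```

Running the same snippet on dev and CI gives two lists that can be diffed directly.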

halostatue commented 4 years ago

I’m not seeing the Downloading in my local Mac build, either. After rm -rf _build/dev/lib/ex_cldr:

Compiling 311 files (.ex)
Generating LowesLoyalty.Cldr for 6 locales named ["en", "en-001", "en-CA", "fr", "fr-CA", ...] with a default locale named "en-CA"
Generated lowes_loyalty app

halostatue commented 4 years ago

I hadn’t removed from deps/ex_cldr/priv/cldr/locales, but I have done mix deps.clean ex_cldr and mix deps.get a couple of times on my local machine. Unfortunately, I’m also not currently looking at the problematic version on my machine. Maybe when I’m more awake I can try this again on my local machine with the new version and completely clean caches locally, then I’ll try on CI. Tomorrow, I think.

kipcole9 commented 4 years ago

All good - I'll keep investigating and leaving notes here - it's my way of keeping track, is all.

kipcole9 commented 4 years ago

I have published ex_cldr version 2.16.0 that may make it easier to find the source of the issue.

Overview of Changes

  1. Set CLDR_DEBUG environment variable to turn on some logging around the locale cache and locale file access

  2. Set the configuration option force_locale_download: true in your backend configuration to always force downloading the locale file.

Suggested triage strategy

  1. Install ex_cldr 2.16.0 and test locally

  2. In CI run with CLDR_DEBUG=true. If it appears locale files are in a location you did not expect, we can explore further.

  3. In your backend, set force_locale_download: Mix.env() == :test and confirm that locales download and are compiled successfully in CI

Changelog

Enhancements

However:

defmodule MyApp.Cldr do
  use Cldr,
    locales: ["en", "fr"],
    default_locale: "en",
    force_locale_download: Mix.env() == :prod
end

Bug Fixes

halostatue commented 4 years ago

OK. I’m putting a comment here as a tracking log of what I’ve done on my local machine.

  1. I switched to the branch I made after cherry-picking my dependencies upgrade off of my previous build and rebased it against the current development state. This will put ex_cldr 2.15.0 in place. Note that this includes changes other than ex_cldr, but I’m trying to get to a state more or less starting from the commits that exhibited this issue in the first place.

  2. mix deps.update ex_cldr: success

  3. rm -rf _build/{dev,test} deps/ex_cldr: success

  4. mix deps.get: success

  5. mix deps.compile: success

  6. ls -lA deps/ex_cldr/priv/cldr/locales _build/dev/lib/ex_cldr/priv/cldr/locales

    _build/dev/lib/ex_cldr/priv/cldr/locales/:
    total 2008
    -rw-r--r--   1 austin  staff  349074  6 Jun 16:08 en-001.json
    -rw-r--r--   1 austin  staff  364516  6 Jun 16:08 en.json
    -rw-r--r--   1 austin  staff  308789  6 Jun 16:08 root.json
    
    deps/ex_cldr/priv/cldr/locales/:
    total 2008
    -rw-r--r--   1 austin  staff  349074  6 Jun 16:08 en-001.json
    -rw-r--r--   1 austin  staff  364516  6 Jun 16:08 en.json
    -rw-r--r--   1 austin  staff  308789  6 Jun 16:08 root.json
  7. mix: success

  8. ls -lA deps/ex_cldr/priv/cldr/locales _build/dev/lib/ex_cldr/priv/cldr/locales

    _build/dev/lib/ex_cldr/priv/cldr/locales/:
    total 4136
    -rw-r--r--  1 austin  staff  349074  6 Jun 16:08 en-001.json
    -rw-r--r--  1 austin  staff  361883  6 Jun 16:16 en-CA.json
    -rw-r--r--  1 austin  staff  364516  6 Jun 16:08 en.json
    -rw-r--r--  1 austin  staff  353125  6 Jun 16:16 fr-CA.json
    -rw-r--r--  1 austin  staff  365732  6 Jun 16:16 fr.json
    -rw-r--r--  1 austin  staff  308789  6 Jun 16:08 root.json
    
    deps/ex_cldr/priv/cldr/locales/:
    total 4136
    -rw-r--r--  1 austin  staff  349074  6 Jun 16:08 en-001.json
    -rw-r--r--  1 austin  staff  361883  6 Jun 16:16 en-CA.json
    -rw-r--r--  1 austin  staff  364516  6 Jun 16:08 en.json
    -rw-r--r--  1 austin  staff  353125  6 Jun 16:16 fr-CA.json
    -rw-r--r--  1 austin  staff  365732  6 Jun 16:16 fr.json
    -rw-r--r--  1 austin  staff  308789  6 Jun 16:08 root.json

No errors on macOS. Now I’m going to push the branch to CI. I’ll post the logs of the build before I switch to a debug shell. Two things of note:

==> ex_cldr_territories
Compiling 3 files (.ex)
warning: Cldr.Config.territory_info/0 is deprecated. Use Cldr.Config.territories/0
  lib/cldr/territory.ex:21

Generated ex_cldr_territories app
…
==> lowes_loyalty
Compiling 321 files (.ex)
Generating LowesLoyalty.Cldr for 6 locales named ["en", "en-001", "en-CA", "fr", "fr-CA", ...] with a default locale named "en-CA"

== Compilation error in file lib/lowes_loyalty/cldr.ex ==
** (Jason.DecodeError) unexpected end of input at position 352347
    lib/jason.ex:78: Jason.decode!/2
    (ex_cldr) lib/cldr/config/config.ex:1277: Cldr.Config.do_get_locale/3
    (ex_cldr) lib/cldr/compiler_locale_cache.ex:71: Cldr.Locale.Cache.do_get_locale/2
    (ex_cldr) lib/cldr/config/config.ex:1866: Cldr.Config.decimal_formats_for/2
    (elixir) lib/enum.ex:1327: Enum."-map/2-lists^map/1-0-"/2
    (elixir) lib/enum.ex:1327: Enum."-map/2-lists^map/1-0-"/2
    (ex_cldr) lib/cldr/config/config.ex:1855: Cldr.Config.decimal_format_list/1
    lib/cldr/number/formatter/decimal_formatter.ex:778: Cldr.Number.Formatter.Decimal.define_to_string/1

Here are the build steps:

  1. mix deps.get: success

  2. mix deps.compile: success

  3. ls -lA deps/ex_cldr/priv/cldr/locales _build/test/lib/ex_cldr/priv/cldr/locales

    _build/test/lib/ex_cldr/priv/cldr/locales/:
    total 1004
    -rw-r--r-- 1 runner runner 349074 Jun  6 20:48 en-001.json
    -rw-r--r-- 1 runner runner 364516 Jun  6 20:48 en.json
    -rw-r--r-- 1 runner runner 308789 Jun  6 20:48 root.json
    
    deps/ex_cldr/priv/cldr/locales/:
    total 1004
    -rw-r--r-- 1 runner runner 349074 Jun  6 20:48 en-001.json
    -rw-r--r-- 1 runner runner 364516 Jun  6 20:48 en.json
    -rw-r--r-- 1 runner runner 308789 Jun  6 20:48 root.json
  4. mix: success (wtf?)

  5. ls -lA deps/ex_cldr/priv/cldr/locales _build/dev/lib/ex_cldr/priv/cldr/locales

    _build/test/lib/ex_cldr/priv/cldr/locales/:
    total 2068
    -rw-r--r-- 1 runner runner 349074 Jun  6 20:48 en-001.json
    -rw-rw-r-- 1 runner runner 361883 Jun  6 20:51 en-CA.json
    -rw-r--r-- 1 runner runner 364516 Jun  6 20:48 en.json
    -rw-rw-r-- 1 runner runner 353125 Jun  6 20:51 fr-CA.json
    -rw-rw-r-- 1 runner runner 365732 Jun  6 20:51 fr.json
    -rw-r--r-- 1 runner runner 308789 Jun  6 20:48 root.json
    
    deps/ex_cldr/priv/cldr/locales/:
    total 2068
    -rw-r--r-- 1 runner runner 349074 Jun  6 20:48 en-001.json
    -rw-rw-r-- 1 runner runner 361883 Jun  6 20:51 en-CA.json
    -rw-r--r-- 1 runner runner 364516 Jun  6 20:48 en.json
    -rw-rw-r-- 1 runner runner 353125 Jun  6 20:51 fr-CA.json
    -rw-rw-r-- 1 runner runner 365732 Jun  6 20:51 fr.json
    -rw-r--r-- 1 runner runner 308789 Jun  6 20:48 root.json

Running it again with my build script after running mix clean…same thing. (I had forgotten to commit the upgrade to 2.16.0, so I’ll be running these tests again.)

halostatue commented 4 years ago

Following your suggested triage strategy:

  1. __Install ex_cldr 2.16.0 and test locally.__ Worked locally. Failed in CI when using my build script (which uses kiex and kerl to set specific Erlang and Elixir versions).

  2. __In CI run with CLDR_DEBUG=true. If it appears locale files are in a location you did not expect, we can explore further.__ Modified my build script to set this, but it still failed…and I see no extra logging information. I’ll try this with a shell soon; adding the force_download flag first.

==> lowes_loyalty
Compiling 321 files (.ex)
Generating LowesLoyalty.Cldr for 6 locales named ["en", "en-001", "en-CA", "fr", "fr-CA", ...] with a default locale named "en-CA"

== Compilation error in file lib/lowes_loyalty/cldr.ex ==
** (Jason.DecodeError) unexpected end of input at position 364686
    lib/jason.ex:78: Jason.decode!/2
    (ex_cldr) lib/cldr/config/config.ex:1297: Cldr.Config.do_get_locale/3
    (ex_cldr) lib/cldr/compiler_locale_cache.ex:73: Cldr.Locale.Cache.do_get_locale/2
    (ex_cldr) lib/cldr/config/config.ex:1895: Cldr.Config.decimal_formats_for/2
    (elixir) lib/enum.ex:1327: Enum."-map/2-lists^map/1-0-"/2
    (elixir) lib/enum.ex:1327: Enum."-map/2-lists^map/1-0-"/2
    (ex_cldr) lib/cldr/config/config.ex:1884: Cldr.Config.decimal_format_list/1
    lib/cldr/number/formatter/decimal_formatter.ex:778: Cldr.Number.Formatter.Decimal.define_to_string/1

  3. In your backend, set force_locale_download: Mix.env() == :test and confirm that locales download and are compiled successfully in CI. Done. Still failed, no additional information.

I just switched my :logger configuration to :debug to see if that’s why I can’t see any more information.

kipcole9 commented 4 years ago

Ah, yes, I should have noted that I am emitting log information at :debug level for CLDR_DEBUG=true. Sorry for not mentioning that in the script. The locale downloading log messages are emitted at :info level.
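For reference, lowering the Logger level is normally done in the project's config. This is a minimal sketch, assuming a standard Mix project with a `config/test.exs`; whether it is enough to surface messages logged while dependencies compile depends on when Logger picks up the setting, so treat it as a starting point rather than a guaranteed fix.

```elixir
# config/test.exs (or whichever environment CI builds in)
import Config

# Assumption: :debug is the lowest level needed to see the CLDR_DEBUG output.
config :logger, level: :debug
```

The CLDR_DEBUG=true environment variable still has to be set for the build itself.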

I am very perplexed still but definitely not giving up. Thanks for sticking with this.

halostatue commented 4 years ago

I made a couple of changes to the downloaded version in deps to remove the if test from maybe_log. Now it always does Logger.info because I couldn’t see any of the log messages at any time.

This is what I see now:

==> ex_cldr_lists
Compiling 3 files (.ex)

21:47:44.368 [info]  Compiler locale cache: Created cache :cldr_locales in :ets

21:47:44.371 [info]  Compiler locale cache: Miss for "root". Reading and decoding the locale file.

21:47:44.371 [info]  Cldr.Config reading locale file "/home/runner/lowes-loyalty-web/_build/test/lib/ex_cldr/priv/cldr/locales/root.json"
Generated ex_cldr_lists app
==> lowes_loyalty
Compiling 321 files (.ex)

21:47:45.815 [info]  Downloaded locale "en"

21:47:45.918 [info]  Downloaded locale "en-001"

21:47:45.998 [info]  Downloaded locale "en-CA"

21:47:46.080 [info]  Downloaded locale "fr"

21:47:46.328 [info]  Downloaded locale "fr-CA"

21:47:46.398 [info]  Downloaded locale "root"
Generating LowesLoyalty.Cldr for 6 locales named ["en", "en-001", "en-CA", "fr", "fr-CA", ...] with a default locale named "en-CA"

21:47:46.558 [info]  Compiler locale cache: Miss for "en". Reading and decoding the locale file.

21:47:46.558 [info]  Cldr.Config reading locale file "/home/runner/lowes-loyalty-web/_build/test/lib/ex_cldr/priv/cldr/locales/en.json"

21:47:46.599 [info]  Compiler locale cache: Miss for "en-001". Reading and decoding the locale file.

21:47:46.599 [info]  Cldr.Config reading locale file "/home/runner/lowes-loyalty-web/_build/test/lib/ex_cldr/priv/cldr/locales/en-001.json"

== Compilation error in file lib/lowes_loyalty/cldr.ex ==
** (Jason.DecodeError) unexpected end of input at position 346863
    lib/jason.ex:78: Jason.decode!/2
    (ex_cldr) lib/cldr/config/config.ex:1297: Cldr.Config.do_get_locale/3
    (ex_cldr) lib/cldr/compiler_locale_cache.ex:73: Cldr.Locale.Cache.do_get_locale/2
    (ex_cldr) lib/cldr/config/config.ex:1895: Cldr.Config.decimal_formats_for/2
    (elixir) lib/enum.ex:1327: Enum."-map/2-lists^map/1-0-"/2
    (elixir) lib/enum.ex:1327: Enum."-map/2-lists^map/1-0-"/2
    (ex_cldr) lib/cldr/config/config.ex:1884: Cldr.Config.decimal_format_list/1
    lib/cldr/number/formatter/decimal_formatter.ex:778: Cldr.Number.Formatter.Decimal.define_to_string/1

halostatue commented 4 years ago

So the error is in locales/en-001.json and this is what jq reports: parse error: Unfinished JSON term at EOF at line 1, column 346863.

I’ve SCPed them from CI.

halostatue commented 4 years ago

For some reason, it’s downloading 2Kb shorter than the rest of the locale files.

halostatue commented 4 years ago

Specifically, it’s cut off after mass_stone.

kipcole9 commented 4 years ago

Really appreciate the perseverance. Any chance you could zip up the en-001.json file you have so I can take a look? It's very, very strange that it's being truncated somewhere along the line.

At least with this last iteration we are closer to understanding the issue - if not yet the cause.

Either a public URL, or feel free to email me at the email address on this repo.

halostatue commented 4 years ago

Sure. If it’s small enough, I’ll attach it to the ticket. I’m doing one other thing. I’ve added a log message of the URL where the json file is downloaded from to the deps/ instance so that I can try curling the JSON file to see if there’s something else going on and if it’s something weird with the CI setup.

halostatue commented 4 years ago

Here’s the cURL output:

curl -vi -O https://raw.githubusercontent.com/elixir-cldr/cldr/v2.16.0/priv/cldr/locales/en-001.json
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 151.101.0.133...
* TCP_NODELAY set
* Connected to raw.githubusercontent.com (151.101.0.133) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
} [5 bytes data]
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* TLSv1.3 (IN), TLS handshake, Server hello (2):
{ [112 bytes data]
* TLSv1.2 (IN), TLS handshake, Certificate (11):
{ [3062 bytes data]
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
{ [300 bytes data]
* TLSv1.2 (IN), TLS handshake, Server finished (14):
{ [4 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
} [37 bytes data]
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
} [1 bytes data]
* TLSv1.2 (OUT), TLS handshake, Finished (20):
} [16 bytes data]
* TLSv1.2 (IN), TLS handshake, Finished (20):
{ [16 bytes data]
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use http/1.1
* Server certificate:
*  subject: C=US; ST=California; L=San Francisco; O=GitHub, Inc.; CN=www.github.com
*  start date: May  6 00:00:00 2020 GMT
*  expire date: Apr 14 12:00:00 2022 GMT
*  subjectAltName: host "raw.githubusercontent.com" matched cert's "*.githubusercontent.com"
*  issuer: C=US; O=DigiCert Inc; OU=www.digicert.com; CN=DigiCert SHA2 High Assurance Server CA
*  SSL certificate verify ok.
} [5 bytes data]
> GET /elixir-cldr/cldr/v2.16.0/priv/cldr/locales/en-001.json HTTP/1.1
> Host: raw.githubusercontent.com
> User-Agent: curl/7.58.0
> Accept: */*
>
{ [5 bytes data]
< HTTP/1.1 200 OK
< Connection: keep-alive
< Content-Length: 349074
< Cache-Control: max-age=300
< Content-Security-Policy: default-src 'none'; style-src 'unsafe-inline'; sandbox
< Content-Type: text/plain; charset=utf-8
< ETag: "96bff550079a088d42dff4963f0dc677aceae883c86cff9e29efb72323c068f7"
< Strict-Transport-Security: max-age=31536000
< X-Content-Type-Options: nosniff
< X-Frame-Options: deny
< X-XSS-Protection: 1; mode=block
< Via: 1.1 varnish (Varnish/6.0)
< X-GitHub-Request-Id: 4F00:1E1E:AFDD2:D8285:5EDC0D32
< Accept-Ranges: bytes
< Date: Sat, 06 Jun 2020 22:04:20 GMT
< Via: 1.1 varnish
< X-Served-By: cache-fra19176-FRA
< X-Cache: MISS, HIT
< X-Cache-Hits: 0, 1
< X-Timer: S1591481061.997607,VS0,VE2
< Vary: Authorization,Accept-Encoding
< Access-Control-Allow-Origin: *
< X-Fastly-Request-ID: 010530320f21503bba0a2b630be0da5e47c79b7b
< Expires: Sat, 06 Jun 2020 22:09:20 GMT
< Source-Age: 10
<
{ [5 bytes data]
100  340k  100  340k    0     0  3277k      0 --:--:-- --:--:-- --:--:-- 3277k
* Connection #0 to host raw.githubusercontent.com left intact

Curl downloaded it just fine.

halostatue commented 4 years ago

en-001.json.gz

kipcole9 commented 4 years ago

I've been trying to think about why this one file, which isn't the largest but is larger than root.json, isn't being fully downloaded or stored. I can add some debug code to check the size of the downloaded file, just in case it's an issue between download and file save.
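The size check could look something like this sketch. `DownloadCheck` is a hypothetical helper, not code from ex_cldr; it assumes the response headers arrive as a list of `{name, value}` charlist tuples with lowercase names, and that the body is either a binary or a list of bytes.

```elixir
defmodule DownloadCheck do
  # Hypothetical helper, not part of ex_cldr. Compares the byte size of a
  # downloaded body with the content-length header, if one is present.
  def verify(headers, body) do
    # iolist_size/1 accepts a binary or a list of bytes (0..255).
    actual = :erlang.iolist_size(body)

    case List.keyfind(headers, ~c"content-length", 0) do
      {_name, value} ->
        expected = List.to_integer(value)
        if expected == actual, do: :ok, else: {:error, {:size_mismatch, expected, actual}}

      nil ->
        # No header to compare against; report what was received.
        {:unknown, actual}
    end
  end
end

IO.inspect(DownloadCheck.verify([{~c"content-length", ~c"5"}], "hello"))
IO.inspect(DownloadCheck.verify([{~c"content-length", ~c"5"}], ~c"hell"))
```

Logging the result just before the file is written would show whether bytes go missing during the download or during the save.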

One very small and highly unlikely thing: I use the http header:

  defp headers do
    [{'Connection', 'close'}]
  end

And actually I don't recall why I set that. If you are game to change the downloaded deps code to:

  defp headers do
    []
  end

That's the only thing I can think of. It's in deps/ex_cldr/lib/cldr/install.ex

kipcole9 commented 4 years ago

OK, your curl experiment is helpful. Perhaps this is therefore an issue related to early closing of the connection, before the receive buffer is filled, at some layer lower than :httpc.

Or ....... and this sounds like a possibility: I'm taking the downloaded charlist and converting it with :erlang.list_to_binary/1, which may be an issue and which I don't believe is necessary.

I'll work on this second part immediately.

halostatue commented 4 years ago

I’m doing a bit more… I added logging of the headers and String.length(:erlang.list_to_binary(body)) to the download.

22:09:57.515 [info]  [
  {'cache-control', 'max-age=300'},
  {'connection', 'close'},
  {'date', 'Sat, 06 Jun 2020 22:09:57 GMT'},
  {'via', '1.1 varnish (Varnish/6.0)'},
  {'accept-ranges', 'bytes'},
  {'etag', '"96bff550079a088d42dff4963f0dc677aceae883c86cff9e29efb72323c068f7"'},
  {'vary', 'Authorization,Accept-Encoding'},
  {'content-length', '349074'},
  {'content-type', 'text/plain; charset=utf-8'},
  {'expires', 'Sat, 06 Jun 2020 22:14:57 GMT'},
  {'content-security-policy', 'default-src \'none\'; style-src \'unsafe-inline\'; sandbox'},
  {'strict-transport-security', 'max-age=31536000'},
  {'x-content-type-options', 'nosniff'},
  {'x-frame-options', 'deny'},
  {'x-xss-protection', '1; mode=block'},
  {'x-github-request-id', '4F00:1E1E:AFDD2:D8285:5EDC0D32'},
  {'x-served-by', 'cache-fra19155-FRA'},
  {'x-cache', 'MISS, HIT'},
  {'x-cache-hits', '0, 1'},
  {'x-timer', 'S1591481397.373365,VS0,VE93'},
  {'access-control-allow-origin', '*'},
  {'x-fastly-request-id', 'bd9fffc713cd4def2bc4aa7def7e8268f609074d'},
  {'source-age', '89'}
]

22:09:57.543 [info]  343880

kipcole9 commented 4 years ago

Hmmm, not the 349074 I'd expect. I now suspect it's the use of :erlang.list_to_binary, which is not UTF-8 aware. I have a new version, 2.16.1-rc.0, coming in 3 minutes.

halostatue commented 4 years ago

Now I’m running it with length(body):

22:19:03.502 [info] charlist length 337266

halostatue commented 4 years ago

Last thing before I need to go eat:

22:26:21.960 [info] 349074.0

That’s (:erlang.list_to_bitstring(body) |> :erlang.bit_size()) / 8, and it looks like the right size.
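As a side note on why these measurements can disagree, here is a small sketch on a toy input (the module name and the sample bytes are made up for illustration). It does not establish what caused the truncation; it only shows that `length/1`, `byte_size/1` and `String.length/1` count different things, and that `/` always returns a float, which is where the trailing `.0` comes from.

```elixir
defmodule SizeSketch do
  # Hypothetical helper: measures a list of bytes in the ways tried in this thread.
  def measures(bytes) when is_list(bytes) do
    binary = :erlang.list_to_binary(bytes)

    %{
      list_length: length(bytes),                  # number of list elements
      byte_size: byte_size(binary),                # number of bytes
      graphemes: String.length(binary),            # number of graphemes
      bits_over_8: :erlang.bit_size(binary) / 8    # bytes again, as a float
    }
  end
end

# "café" as UTF-8 bytes: é is the two bytes 0xC3 0xA9.
IO.inspect(SizeSketch.measures([?c, ?a, ?f, 0xC3, 0xA9]))
```

For a UTF-8 file with any multibyte characters, `String.length/1` will come out smaller than the byte count, so it is not directly comparable with the content-length header.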

I’ll pick this up after.

kipcole9 commented 4 years ago

Got it, thanks. Will have a new version ready when you get back.

kipcole9 commented 4 years ago

If you update mix.exs to set {:ex_cldr, "~> 2.16-rc", override: true} and run mix deps.get, then that version:

  1. Does not do Connection: close on the :httpc request.
  2. Does not convert the charlist to binary before writing the locale file to disk

I appreciate this is disruptive since you've added some debug code, but I think the issue may be the extraneous use of :erlang.list_to_binary/1, which is not UTF-8 friendly, with some operating system / file system / encoding differences then causing the failure. It's speculation until you try it out, of course.

halostatue commented 4 years ago

I will do so. The debug code disappeared anyway because the shell session expired. ~I’ll do so in a couple of hours.~

This compiled with no errors.

halostatue commented 4 years ago

This worked in a new shell. Running a version with the changes pushed to let you know. I’m hopping offline for at least an hour, so I’ll let you know then, but it’s very promising.

halostatue commented 4 years ago

:shipit: The build is good.

kipcole9 commented 4 years ago

Woohoo. That was a bit of a chase. I'll push a formal release in the next 20 minutes. Really appreciate your support and collaboration.

kipcole9 commented 4 years ago

Published to hex as ex_cldr version 2.16.1.

Changelog

Bug Fixes