gagolews / stringi

Fast and portable character string processing in R (with the Unicode ICU)
https://stringi.gagolewski.com/

MacOS 10.15.7: md5sum mismatch for icu69/data/icu4c-69_1-data-bin-l.zip #454

Closed sunnycxh0 closed 3 years ago

sunnycxh0 commented 3 years ago

Installing package v1.7.4 using R 4.1.x on macOS 10.15.7, whether from the downloaded source or from a CRAN mirror, fails with this md5sum mismatch error:

checking whether the ICU data library can be downloaded... downloading the ICU data library (icudt)
output path: icu69/data/icu4c-69_1-data-bin-l.zip
trying URL 'https://raw.githubusercontent.com/gagolews/stringi/master/src/icu69/data/icu4c-69_1-data-bin-l.zip'
downloaded 10.9 MB

md5sum mismatch for icu69/data/icu4c-69_1-data-bin-l.zip (8e9f2d4978c7e893bf4ab08d4cac4066 vs. 58ecd3e72e9d96ea2876dd89627afeb8)
trying URL 'http://raw.githubusercontent.com/gagolews/stringi/master/src/icu69/data/icu4c-69_1-data-bin-l.zip'
downloaded 10.9 MB

md5sum mismatch for icu69/data/icu4c-69_1-data-bin-l.zip (8e9f2d4978c7e893bf4ab08d4cac4066 vs. 58ecd3e72e9d96ea2876dd89627afeb8)
icudt download failed
Error: Stopping on error
Execution halted
*** *********************************************************************
*** stringi cannot be built.
*** Failed to download the ICU data library (icudt). Stopping now.
*** For build environments that have no internet access,
*** see the INSTALL file for a workaround.
*** *********************************************************************
ERROR: configuration failed for package ‘stringi’
* removing ‘/usr/local/lib/R/lib/R/library/stringi’
* restoring previous ‘/usr/local/lib/R/lib/R/library/stringi’
Warning message:
In install.packages("~/Downloads/stringi_1.7.4.tar.gz", repos = NULL,  :
  installation of package ‘/Users/scui/Downloads/stringi_1.7.4.tar.gz’ had non-zero exit status

gagolews commented 3 years ago

I cannot reproduce this, got:

gagolews@dionysus:tmp$ wget https://raw.githubusercontent.com/gagolews/stringi/master/src/icu69/data/icu4c-69_1-data-bin-l.zip
icu4c-69_1-data-bin-l.zip    100%[=============================================>]  10.92M  5.57MB/s    in 2.0s    

2021-09-09 12:30:56 (5.57 MB/s) - ‘icu4c-69_1-data-bin-l.zip’ saved [11454999/11454999]

gagolews@dionysus:tmp$ md5sum icu4c-69_1-data-bin-l.zip 
58ecd3e72e9d96ea2876dd89627afeb8  icu4c-69_1-data-bin-l.zip

Can you try again?

sunnycxh0 commented 3 years ago

That's very odd. I copied the exact commands you showed here and got a different checksum:

(venv) USSD-OLM-044277:icu4c_troubleshoot scui$ wget https://raw.githubusercontent.com/gagolews/stringi/master/src/icu69/data/icu4c-69_1-data-bin-l.zip
--2021-09-08 19:43:29--  https://raw.githubusercontent.com/gagolews/stringi/master/src/icu69/data/icu4c-69_1-data-bin-l.zip
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/zip]
Saving to: ‘icu4c-69_1-data-bin-l.zip’

icu4c-69_1-data-bin     [                 <=>]  10.91M  3.39MB/s    in 3.4s    

2021-09-08 19:43:33 (3.25 MB/s) - ‘icu4c-69_1-data-bin-l.zip’ saved [11435875]

(venv) USSD-OLM-044277:icu4c_troubleshoot scui$ md5sum icu4c-69_1-data-bin-l.zip 
8e9f2d4978c7e893bf4ab08d4cac4066  icu4c-69_1-data-bin-l.zip

gagolews commented 3 years ago

Can you try decompressing the zip file and seeing what's inside?

I get:

unzip icu4c-69_1-data-bin-l.zip 
Archive:  icu4c-69_1-data-bin-l.zip
  inflating: LICENSE                 
  inflating: icu4c-69.1-data-bin-l-README.md  
  inflating: icudt69l.dat  

This is hosted on GitHub, so it is highly unlikely that anyone is tinkering with the file along the way. Are you behind a firewall or a VPN?

gagolews commented 3 years ago

It seems that the file sizes do not match: yours is 11435875 bytes vs. mine at 11454999 bytes.
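The size and checksum comparison done by hand in this exchange can be scripted. The following is a minimal sketch, not part of stringi: the function names `md5_of` and `verify_download` are hypothetical, and the fallback to the BSD `md5` tool assumes a stock macOS where GNU `md5sum` is not installed.

```shell
#!/bin/sh
# Print the MD5 digest of a file; falls back to the BSD `md5` tool
# when GNU `md5sum` is not available (assumption: stock macOS).
md5_of() {
    if command -v md5sum >/dev/null 2>&1; then
        md5sum < "$1" | cut -d' ' -f1
    else
        md5 -q "$1"
    fi
}

# verify_download FILE EXPECTED_MD5   (hypothetical helper)
# Prints the file size and digest.
# Returns 0 on match, 1 on mismatch, 2 if the file is missing.
verify_download() {
    file=$1
    expected=$2
    [ -f "$file" ] || { echo "missing: $file" >&2; return 2; }
    size=$(wc -c < "$file" | tr -d ' ')
    actual=$(md5_of "$file")
    echo "$file: $size bytes, md5 $actual"
    if [ "$actual" = "$expected" ]; then
        echo "checksum OK"
    else
        echo "checksum MISMATCH (expected $expected)" >&2
        return 1
    fi
}

# Usage, with the digest the stringi configure step reports as expected:
#   verify_download icu4c-69_1-data-bin-l.zip 58ecd3e72e9d96ea2876dd89627afeb8
```

Printing the byte count next to the digest makes a re-encoded or truncated download visible at a glance, as with the 11435875 vs. 11454999 difference noted here.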

sunnycxh0 commented 3 years ago

Archive:  icu4c-69_1-data-bin-l.zip
  inflating: LICENSE
  inflating: icu4c-69.1-data-bin-l-README.md
  inflating: icudt69l.dat

I am currently connected to a VPN, and I'm guessing the problem might be Mac-specific. I just tried downloading the file from a Linux machine (CentOS Linux release 7.9.2009 (Core)) and got the expected checksum. I'll try syncing the Mac-downloaded file to Linux and running another checksum check.

gagolews commented 3 years ago

Maybe setting

options(download.file.method='curl')

or

options(download.file.method='libcurl')

will help?

gagolews commented 3 years ago

That's before install.packages('stringi')

sunnycxh0 commented 3 years ago

After unzipping, the extracted contents are identical; only the zip files themselves differ:

scui ~/icu4c_temp $ md5sum mac/*
8e9f2d4978c7e893bf4ab08d4cac4066  mac/icu4c-69_1-data-bin-l.zip
470884eb27f0c65a5735fd25c336ab58  mac/icu4c-69.1-data-bin-l-README.md
0716a23c570a084e047fade8fbbeaffd  mac/icudt69l.dat
002d2fdc32d17f0ec06e9a47f2c0c8d0  mac/LICENSE
scui ~/icu4c_temp $ md5sum linux/*
58ecd3e72e9d96ea2876dd89627afeb8  linux/icu4c-69_1-data-bin-l.zip
470884eb27f0c65a5735fd25c336ab58  linux/icu4c-69.1-data-bin-l-README.md
0716a23c570a084e047fade8fbbeaffd  linux/icudt69l.dat
002d2fdc32d17f0ec06e9a47f2c0c8d0  linux/LICENSE

I'll try options(download.file.method='curl') to see if that makes a difference.
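The comparison above (same extracted files, different archives) can be automated. This is a sketch under assumptions: the helper names `list_member_sums` and `same_extracted_contents` are hypothetical, and each directory is assumed to hold one archive's extracted files alongside the archive itself, as in the `mac/` and `linux/` folders shown.

```shell
#!/bin/sh
# Print the MD5 digest of a file (GNU md5sum, else BSD md5).
md5_of() {
    if command -v md5sum >/dev/null 2>&1; then
        md5sum < "$1" | cut -d' ' -f1
    else
        md5 -q "$1"
    fi
}

# Print "digest  name" for every regular file in DIR, skipping the
# *.zip archives so that only the extracted members are compared.
list_member_sums() {
    dir=$1
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        case "$f" in
            *.zip) continue ;;
        esac
        printf '%s  %s\n' "$(md5_of "$f")" "$(basename "$f")"
    done
}

# same_extracted_contents DIR_A DIR_B   (hypothetical helper)
# Returns 0 when both directories hold the same file names with the
# same digests, ignoring the archives themselves.
same_extracted_contents() {
    a=$(list_member_sums "$1")
    b=$(list_member_sums "$2")
    [ "$a" = "$b" ]
}

# Usage:
#   same_extracted_contents mac linux && echo "members identical"
```

Identical members inside differing archives suggest the zip container was rewritten somewhere along the download path rather than the data being corrupted, though this check alone does not say where that happened.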

sunnycxh0 commented 3 years ago

That's before install.packages('stringi')

Same error unfortunately:

checking for pkg-config... /usr/local/bin/pkg-config
checking with pkg-config for the system ICU4C... no
*** pkg-config did not detect ICU4C-devel libraries installed
*** Trying with 'standard' fallback flags
checking whether an ICU4C-based project can be built... no
*** This version of ICU4C cannot be used.
*** Using the ICU 69 bundle.
checking whether we may compile src/icu69/common/putil.cpp... yes
checking whether we may compile src/icu69/i18n/number_affixutils.cpp... yes
checking whether alignof(std::max_align_t) is available... yes
checking whether the ICU data library can be downloaded... downloading the ICU data library (icudt)
output path: icu69/data/icu4c-69_1-data-bin-l.zip
trying URL 'https://raw.githubusercontent.com/gagolews/stringi/master/src/icu69/data/icu4c-69_1-data-bin-l.zip'
downloaded 10.9 MB

md5sum mismatch for icu69/data/icu4c-69_1-data-bin-l.zip (8e9f2d4978c7e893bf4ab08d4cac4066 vs. 58ecd3e72e9d96ea2876dd89627afeb8)
trying URL 'http://raw.githubusercontent.com/gagolews/stringi/master/src/icu69/data/icu4c-69_1-data-bin-l.zip'
downloaded 10.9 MB

md5sum mismatch for icu69/data/icu4c-69_1-data-bin-l.zip (8e9f2d4978c7e893bf4ab08d4cac4066 vs. 58ecd3e72e9d96ea2876dd89627afeb8)
icudt download failed
Error: Stopping on error
Execution halted
*** *********************************************************************
*** stringi cannot be built.
*** Failed to download the ICU data library (icudt). Stopping now.
*** For build environments that have no internet access,
*** see the INSTALL file for a workaround.
*** *********************************************************************
ERROR: configuration failed for package ‘stringi’
* removing ‘/usr/local/lib/R/lib/R/library/stringi’
* restoring previous ‘/usr/local/lib/R/lib/R/library/stringi’

The downloaded source packages are in
    ‘/private/var/folders/np/7j87bpgd1c1fxqlwr9pndps4pry50x/T/Rtmpc6qK64/downloaded_packages’
Updating HTML index of packages in '.Library'
Making 'packages.html' ... done
Warning message:
In install.packages("stringi") :
  installation of package ‘stringi’ had non-zero exit status

sunnycxh0 commented 3 years ago

@gagolews, since the actual contents of the archive are identical, is it possible to downgrade the md5sum mismatch to a warning rather than a fatal error?

gagolews commented 3 years ago

As you have already downloaded the correct (untruncated) version of icudt, try the procedure described under "ICU Data Library and No Internet Access" at https://stringi.gagolewski.com/install.html#icu-data-library-and-no-internet-access

gagolews commented 3 years ago

Re: warning - I do not consider it a safe solution.

sunnycxh0 commented 3 years ago

Pointing to the unzipped folder:

install.packages('stringi',configure.vars="ICUDT_DIR=/Users/scui/scratch/icu4c_troubleshoot")

Unfortunately, the installation now complains about the ICU4C version and tries to download again.

*** pkg-config did not detect ICU4C-devel libraries installed
*** Trying with 'standard' fallback flags
checking whether an ICU4C-based project can be built... no
*** This version of ICU4C cannot be used.
*** Using the ICU 69 bundle.
checking whether we may compile src/icu69/common/putil.cpp... yes
checking whether we may compile src/icu69/i18n/number_affixutils.cpp... yes
checking whether alignof(std::max_align_t) is available... yes
checking whether the ICU data library can be downloaded... downloading the ICU data library (icudt)
output path: /Users/scui/scratch/glimpse_install/icu4c_troubleshoot/icu4c-69_1-data-bin-l.zip
trying URL 'https://raw.githubusercontent.com/gagolews/stringi/master/src/icu69/data/icu4c-69_1-data-bin-l.zip'
downloaded 10.9 MB
...

gagolews commented 3 years ago

It must point to the folder with the .zip file, not its uncompressed version.
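A quick pre-flight check can catch this before starting the build. The sketch below is an illustration only: `icudt_dir_ready` is a hypothetical helper, not something shipped with stringi, and the default archive name is taken from the configure output earlier in this thread.

```shell
#!/bin/sh
# icudt_dir_ready DIR [ZIP_NAME]   (hypothetical helper)
# Returns 0 when DIR holds the zipped ICU data archive, 1 otherwise.
icudt_dir_ready() {
    dir=$1
    zip_name=${2:-icu4c-69_1-data-bin-l.zip}
    if [ -f "$dir/$zip_name" ]; then
        echo "found $dir/$zip_name"
        return 0
    fi
    # An unpacked .dat file alone is not enough: ICUDT_DIR must point
    # at the folder containing the .zip, per the reply above.
    for f in "$dir"/icudt*.dat; do
        if [ -f "$f" ]; then
            echo "only unpacked data found ($f); copy the .zip into $dir" >&2
            return 1
        fi
    done
    echo "no ICU data archive in $dir" >&2
    return 1
}

# Usage (the path is a placeholder):
#   icudt_dir_ready /path/to/icudt_dir
# and, if that succeeds, from within R:
#   install.packages('stringi', configure.vars="ICUDT_DIR=/path/to/icudt_dir")
```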

sunnycxh0 commented 3 years ago

Cool. Copying over the zip file downloaded on CentOS got the installation process started. Thank you!

darylz commented 2 years ago

I got the same issue when connecting through a VPN.