Closed fred-udina closed 2 years ago
This is certainly quite strange, because with the latest version this works perfectly for me.
Can you please check that your Internet connection allows you to retrieve this URL for REO 1031?
Also, what happens when you run
CEOmeta()
Are you able to get the metadata of all the CEO studies?
Hi, thanks for your quick answer. I could indeed get the zip file from the link you asked about. I have the latest version of CEOdata (installed today), as well as recent versions of R and RStudio. I work from my UPF office, so on the Catalan universities network. And:
> CEOmeta()
A problem downloading the metadata has occurred. The server may be temporarily down, or the file name has changed. Please try again later or open an issue at https://github.com/ceopinio/CEOdata indicating 'Problem with metadata file'
NULL
Good afternoon, @fred-udina,
Definitely some sort of network-related problem. Not yours. I am assuming you are on Windows.
What do you get with CEOdata() without arguments?
My suspicion right now is an obscure problem with how R on Windows deals with secure servers, specifically servers from gencat, which proved troublesome during development. I don't have easy access to a Windows machine, but let me look into it.
And thank you very much for reporting it. So far we have had other users (also working from the same physical location and machines) and nothing has come up. So please let me look into it.
I work with MacOS 10.15.7. I hope that CEOdata will work fine with Windows because it is what my students will mainly use.
CEOdata() with no args works for me:
> CEOdata()
Downloading the barometer.
trying URL 'https://ceo.gencat.cat/web/.content/20_barometre/Matrius_BOP/Microdades_barometre.zip'
Content type 'application/zip' length 10111044 bytes (9.6 MB)
==================================================
downloaded 9.6 MB
Converting the original data into R. This may take a while.
Post-processing the data. This may take a while.
# A tibble: 37,838 × 962
PONDERA ORDRECINE ORDRE_R…¹ REO METOD…² BOP_NUM ANY MES DIA HORA_…³ HORA_…⁴ DATA_INI DATA_FIN DURADA FASE ENQUESTAD…⁵
<dbl> <dbl> <dbl> <dbl> <fct> <fct> <dbl> <dbl> <dbl> <time> <time> <date> <date> <dbl> <fct> <dbl>
but I still have
> d <- CEOdata(reo="1031")
A problem downloading the metadata has occurred. The server may be temporarily down, or the file name has changed. Please try again later or open an issue at https://github.com/ceopinio/CEOdata indicating 'Problem with metadata file'
A problem downloading the metadata has occurred. The server may be temporarily down, or the file name has changed. Please try again later or open an issue at https://github.com/ceopinio/CEOdata indicating 'Problem with metadata file'
Error in if (!is.na(url.reo)) { : argument is of length zero
It doesn't look like any problem with the network:
~$ wget https://ceo.gencat.cat/web/.content/30_estudis/repositorimatrius/2022/Microdades_anonimitzades_1031.zip
--2022-09-14 15:15:17-- https://ceo.gencat.cat/web/.content/30_estudis/repositorimatrius/2022/Microdades_anonimitzades_1031.zip
Resolving ceo.gencat.cat (ceo.gencat.cat)... 23.39.109.188
Connecting to ceo.gencat.cat (ceo.gencat.cat)|23.39.109.188|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 458386 (448K) [application/zip]
Saving to: ‘Microdades_anonimitzades_1031.zip’
Microdades_anonimit 100%[===================>] 447,64K --.-KB/s in 0,06s
2022-09-14 15:15:18 (7,35 MB/s) - ‘Microdades_anonimitzades_1031.zip’ saved [458386/458386]
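For readers puzzled by the `argument is of length zero` message in the session above, here is a minimal sketch of how that error can arise in base R. It assumes, without having checked the CEOdata source, that `url.reo` ends up as `NULL` when the metadata download fails (as the `NULL` returned by `CEOmeta()` suggests). The `meta` object, its `url` field and the `has_url()` helper are hypothetical names used only for illustration.

```r
# Hypothetical stand-in: a metadata lookup that failed and returned NULL,
# as CEOmeta() did in the session above.
meta <- NULL
url.reo <- meta$url  # extracting a field from NULL gives NULL (length zero)

# is.na(NULL) is logical(0), so the condition has no TRUE/FALSE value
# and if() raises an error instead of taking either branch.
msg <- tryCatch(
  if (!is.na(url.reo)) "download" else "skip",
  error = function(e) conditionMessage(e)
)
print(msg)

# A defensive variant (hypothetical helper): check the length first,
# so a missing value is treated as "no URL" rather than as an error.
has_url <- function(x) length(x) == 1 && !is.na(x)
print(has_url(url.reo))
print(has_url("https://example.org/data.zip"))
```

This only illustrates the R behaviour behind the message; it says nothing about why the metadata download itself failed.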
I just tried it on my home Mac, macOS 12.5.1, R 4.2.1, CEOdata 1.2.0.1
The problem is the same with CEOdata(reo = "1031").
Yes, I have managed to try it on a Windows machine and that is also the case. My GNU/Linux, though, does work well. I'm on it.
Just to play with it, I tried R in a GNU/Linux VirtualBox machine running on my Mac. The same problem appears when asking for reo=1031, but not when calling CEOdata() without args.
Thank you @fred-udina, for helping me out.
I think I have found it.
Can you please also install "curl" (install.packages('curl')) and then try again and report back? Thank you.
I have found that, for some reason I have yet to understand, 'curl' is no longer being loaded, and the call to jsonlite that retrieves the metadata does not work without it.
The main merged barometer is not affected because it does not load its data from the metadata.
A temporary workaround would be something like:
library(dplyr)  # provides filter()
CEOdata() |>
  filter(REO == "1031")
to achieve the same behaviour as CEOdata(reo = "1031").
But it must work properly anyway.
Yes, that was it. Some time ago I had problems with URLs that were also fixed by the curl package!
> install.packages("curl")
trying URL 'https://cran.rstudio.com/bin/macosx/contrib/4.2/curl_4.3.2.tgz'
Content type 'application/x-gzip' length 861741 bytes (841 KB)
==================================================
downloaded 841 KB
The downloaded binary packages are in
/var/folders/n3/dyjkdb8d66vbrchsszv6vmzm0000gp/T//RtmpD22Vy7/downloaded_packages
> library(curl)
Using libcurl 7.79.1 with LibreSSL/3.3.6
> CEOdata(reo = "1031") -> d
trying URL 'https://ceo.gencat.cat/web/.content/30_estudis/repositorimatrius/2022/Microdades_anonimitzades_1031.zip'
Content type 'application/zip' length 458386 bytes (447 KB)
==================================================
downloaded 447 KB
Converting the original data into R. This may take a while.
>
OK, thank you for confirming, @fred-udina. I will leave this issue open until we decide what to do about 'curl', which jsonlite depends on (as seen in the aforementioned issue).
This is quite weird. On my main Mac, CEOdata(reo="1031") wasn't working. Then I installed curl, did NOT attach it, and it works anyway.
> d <- CEOdata(reo="1031")
A problem downloading the metadata has occurred. The server may be temporarily down, or the file name has changed. Please try again later or open an issue at https://github.com/ceopinio/CEOdata indicating 'Problem with metadata file'
A problem downloading the metadata has occurred. The server may be temporarily down, or the file name has changed. Please try again later or open an issue at https://github.com/ceopinio/CEOdata indicating 'Problem with metadata file'
Error in if (!is.na(url.reo)) { : argument is of length zero
> install.packages("curl")
trying URL 'https://cran.rstudio.com/bin/macosx/contrib/4.2/curl_4.3.2.tgz'
Content type 'application/x-gzip' length 861741 bytes (841 KB)
==================================================
downloaded 841 KB
The downloaded binary packages are in
/var/folders/n3/dyjkdb8d66vbrchsszv6vmzm0000gq/T//RtmptYlrfo/downloaded_packages
> d <- CEOdata(reo="1031")
trying URL 'https://ceo.gencat.cat/web/.content/30_estudis/repositorimatrius/2022/Microdades_anonimitzades_1031.zip'
Content type 'application/zip' length 458386 bytes (447 KB)
==================================================
downloaded 447 KB
Converting the original data into R. This may take a while.
>
Without investigating further, I would say this is reasonable, since the "curl" package also affects other functions, which then gain "curl goodies" such as encryption. Also, jsonlite itself, which is called in CEOdata(), silently loads curl if it is available on the system. So this is expected behaviour.
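The distinction between a package being installed, loaded and attached can be checked directly in base R. The sketch below is only an illustration: `package_state()` is a hypothetical helper, and it uses the `tools` package (which ships with R) as a stand-in so it runs without 'curl' being present.

```r
# Hypothetical helper: report whether a package is installed, whether its
# namespace is loaded, and whether it is attached to the search path.
package_state <- function(pkg) {
  c(
    installed = nzchar(system.file(package = pkg)),
    loaded    = pkg %in% loadedNamespaces(),
    attached  = paste0("package:", pkg) %in% search()
  )
}

# Loading a namespace makes its code usable by other packages,
# but does not attach it the way library() would.
requireNamespace("tools", quietly = TRUE)
print(package_state("tools"))
```

Running `package_state("curl")` before and after a call to CEOdata() would show whether the namespace was loaded silently even though `library(curl)` was never called.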
-- Xavier
Well, some say that R is not a real programming language...
Just a question: are you planning to declare CEOdata package as dependent on curl? Otherwise I should instruct my students to load it before using CEOdata. Any small trouble is for them a demotivating disaster.
So far there is a message (pending approval in the main CEO repository) about the temporary need to ensure that curl is installed (you can see it in my fork).
The proper way to proceed would be to wait for input from jsonlite, because that is where the dependency issue lies. If that is not successful, we could add a dependency, but it is not my preferred option: CEOdata depends on jsonlite, which is the package involved, and it is there that the dependency on curl is not resolved.
For students, you can instruct them to do something like this at the very beginning (I do it myself in my classes). It is very convenient because from the first day of the course all the packages are properly loaded. Of course, you can adapt it to your needs:
install.packages(c("CEOdata", "curl", "ggplot2", "dplyr", "tidyr", "ggmcmc"), dependencies = TRUE)
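If students rerun the setup script several times, a variant that installs only what is missing may save some time. This is a minimal sketch, not part of CEOdata: `missing_packages()` and `course_packages` are hypothetical names, and the install step is guarded by `interactive()` so that sourcing the script in a batch job installs nothing.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed.
# system.file() returns "" for a package that cannot be found.
missing_packages <- function(pkgs) {
  installed <- vapply(
    pkgs,
    function(p) nzchar(system.file(package = p)),
    logical(1)
  )
  pkgs[!installed]
}

# Same package list as in the one-liner above.
course_packages <- c("CEOdata", "curl", "ggplot2", "dplyr", "tidyr", "ggmcmc")
to_install <- missing_packages(course_packages)

# Only install in an interactive session, and only what is missing.
if (length(to_install) > 0 && interactive()) {
  install.packages(to_install, dependencies = TRUE)
}
```

The one-liner above remains the simplest option for a first-day setup; this version is just friendlier to repeated runs.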
Yes, I agree with your approach. Thanks.
So let's wait for the reply and keep this issue open for some more time.
This has been solved in 'jsonlite' (see jsonlite's issue), and 'curl' is no longer a dependency. Still, we need to keep the note on the main site so that users are aware that 'curl' is needed, since the CRAN version does not yet include the new code that works without 'curl'.
Perfect, thank you again.
Happy to know that the CEOdata package exists! But on my first attempt... Am I doing anything wrong? Frederic