Closed by guohuansu 1 week ago
Can you please provide a reproducible example using reprex?
Can you also provide your session info? It's hard to determine what's going on without sufficient examples. E.g.:
packageVersion("rfishbase")
#> [1] '5.0.0'
Created on 2024-10-11 with reprex v2.1.0
For example, for the function issue: reprex::reprex(rfishbase::common_names("Gadus morhua"))
rfishbase::common_names("Gadus morhua")
#> Joining with `by = join_by(Subfamily, GenCode, FamCode)`
#> Joining with `by = join_by(FamCode)`
#> Joining with `by = join_by(Order, Ordnum, Class, ClassNum)`
#> Joining with `by = join_by(Class, ClassNum)`
#> # A tibble: 124 × 4
#> Species ComName Language SpecCode
#> <chr> <chr> <chr> <int>
#> 1 Gadus morhua Atlantic cod English 69
#> 2 Gadus morhua Bacalao English 69
#> 3 Gadus morhua Bacaleau English 69
#> 4 Gadus morhua Baccalao English 69
#> 5 Gadus morhua Baccale English 69
#> 6 Gadus morhua Baccalo English 69
#> 7 Gadus morhua Bank cod English 69
#> 8 Gadus morhua Bank fish English 69
#> 9 Gadus morhua Bastard English 69
#> 10 Gadus morhua Berry fish English 69
#> # ℹ 114 more rows
Created on 2024-10-11 with reprex v2.1.0
For session information: reprex::reprex(sessionInfo())
sessionInfo()
#> R version 4.2.0 (2022-04-22 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19045)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=English_United States.utf8
#> [2] LC_CTYPE=English_United States.utf8
#> [3] LC_MONETARY=English_United States.utf8
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=English_United States.utf8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> loaded via a namespace (and not attached):
#> [1] digest_0.6.29 withr_3.0.0 R.methodsS3_1.8.2 lifecycle_1.0.4
#> [5] magrittr_2.0.3 reprex_2.1.0 evaluate_0.23 rlang_1.1.3
#> [9] cli_3.6.2 rstudioapi_0.16.0 fs_1.6.3 R.utils_2.12.3
#> [13] R.oo_1.26.0 vctrs_0.6.5 styler_1.10.3 rmarkdown_2.27
#> [17] tools_4.2.0 R.cache_0.16.0 glue_1.6.2 purrr_1.0.2
#> [21] xfun_0.42 yaml_2.3.5 fastmap_1.1.1 compiler_4.2.0
#> [25] htmltools_0.5.8.1 knitr_1.45
Created on 2024-10-11 with reprex v2.1.0
Hi, thank you for your reply. Here are the examples: reprex::reprex(packageVersion("rfishbase"))
packageVersion("rfishbase")
#> [1] '5.0.0'
Created on 2024-10-12 with reprex v2.1.0
reprex::reprex(rfishbase::common_names("Gadus morhua"))
rfishbase::common_names("Gadus morhua")
#> Warning in open.connection(con, "rb"): URL
#> 'https://huggingface.co/api/datasets/cboettig/fishbase/tree/main/data/fb':
#> Timeout of 60 seconds was reached
#> Error in open.connection(con, "rb"): cannot open the connection to 'https://huggingface.co/api/datasets/cboettig/fishbase/tree/main/data/fb'
Created on 2024-10-12 with reprex v2.1.0
reprex::reprex(sessionInfo())
sessionInfo()
#> R version 4.4.0 (2024-04-24 ucrt)
#> Platform: x86_64-w64-mingw32/x64
#> Running under: Windows 11 x64 (build 22631)
#>
#> Matrix products: default
#>
#>
#> locale:
#> [1] LC_COLLATE=Chinese (Simplified)_China.utf8
#> [2] LC_CTYPE=Chinese (Simplified)_China.utf8
#> [3] LC_MONETARY=Chinese (Simplified)_China.utf8
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=Chinese (Simplified)_China.utf8
#>
#> time zone: Asia/Shanghai
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> loaded via a namespace (and not attached):
#> [1] digest_0.6.35 fastmap_1.2.0 xfun_0.44 glue_1.8.0
#> [5] knitr_1.46 htmltools_0.5.8.1 rmarkdown_2.27 lifecycle_1.0.4
#> [9] cli_3.6.3 reprex_2.1.0 withr_3.0.1 compiler_4.4.0
#> [13] rstudioapi_0.16.0 tools_4.4.0 evaluate_0.23 yaml_2.3.8
#> [17] rlang_1.1.4 fs_1.6.4
Created on 2024-10-12 with reprex v2.1.0
@guohuansu thanks for the report. Can you see if you can open that link (https://huggingface.co/api/datasets/cboettig/fishbase/tree/main/data/fb) (a) in your browser, and (b), from R, e.g.
httr::GET("https://huggingface.co/api/datasets/cboettig/fishbase/tree/main/data/fb")
I'm wondering if we have a firewall issue rather than a package issue.
Also, a minor thing: when building the reprex, it's nice to use explicit library() calls so that we see rfishbase showing up in the sessionInfo().
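To separate a firewall problem from a package problem, a small check like the one below may help. It is only a sketch: `can_reach()` is a hypothetical helper (not part of rfishbase or httr), and the 10-second timeout is an arbitrary choice.

```r
# Hypothetical helper: TRUE if `url` answers with a non-error HTTP status
# within `timeout` seconds, FALSE on a timeout, DNS failure, or HTTP error.
can_reach <- function(url, timeout = 10) {
  tryCatch({
    resp <- httr::GET(url, httr::timeout(timeout))
    httr::status_code(resp) < 400
  }, error = function(e) {
    # Report the underlying curl error so it can be pasted into the issue.
    message("Request failed: ", conditionMessage(e))
    FALSE
  })
}
```

Calling `can_reach("https://huggingface.co/api/datasets/cboettig/fishbase/tree/main/data/fb")` from the same R session that runs rfishbase shows whether R itself can get through. A `FALSE` here while the browser works points at the network or proxy setup rather than at the package.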
Yes, I can see 32 lines of code after opening the link in the browser. But I can't open it from R using your code. The error is shown below:
httr::GET("https://huggingface.co/api/datasets/cboettig/fishbase/tree/main/data/fb")
#> Error in curl::curl_fetch_memory(url, handle = handle): Timeout was reached: [huggingface.co] Failed to connect to huggingface.co port 443 after 10001 ms: Timeout was reached
I used library(rfishbase), and it now shows up in sessionInfo() as below:
other attached packages:
[1] rfishbase_5.0.0
@guohuansu what 32 lines do you see in the browser?? You should be seeing only a small JSON blob showing the 5 releases (probably as a single line of code).
[{"type":"directory","oid":"34a1366f434dd9947de4288f208bea23b706db5f","size":0,"path":"data/fb/v19.04"},{"type":"directory","oid":"79beffc6a394f1de30e6e8172f1d7dbcb36d1fd8","size":0,"path":"data/fb/v21.06"},{"type":"directory","oid":"fd595018a45d57981999a6c4b45fdbc388f72b20","size":0,"path":"data/fb/v23.01"},{"type":"directory","oid":"05e98477e2ec0fb8aa6d90846ca9105f28430809","size":0,"path":"data/fb/v23.05"},{"type":"directory","oid":"5933af7d51bb10fed47beb735ff09ee7ba7df0e2","size":0,"path":"data/fb/v24.07"}]
If GET is failing, this is unfortunately not an issue with rfishbase but with your R installation's libcurl bindings.
Let's also test outside of R in the terminal. What do you get with:
curl -L "https://huggingface.co/api/datasets/cboettig/fishbase/tree/main/data/fb"
It should be the same JSON as above. If not, let's try with verbose mode and see if we can debug:
curl -vv -L "https://huggingface.co/api/datasets/cboettig/fishbase/tree/main/data/fb"
I have exactly the same issue: I can access "https://huggingface.co/api/datasets/cboettig/fishbase/tree/main/data/fb" in my browser, but not in R or the terminal.
So I just rolled back to an older version of 'rfishbase' and that solved it. Here's some info that may help:
packageVersion("rfishbase")
#[1] ‘3.1.9’
rfishbase::common_names("Gadus morhua")
#Importing C:\Users\15850\AppData\Roaming/R/data/R/rfishbase/comnames_fb_2104.tsv.bz2 in 1000000 line chunks:
#Rows: 324211 Columns: 35
-- Column specification -------------------------------------------------------------------------------------
Delimiter: "\t"
chr (34): autoctr, ComName, Transliteration, StockCode, SpecCode, C_Code, Language, Script, UnicodeText, ...
dbl (1): ComNamesRefNo
i Use `spec()` to retrieve the full column specification for this data.
i Specify the column types or set `show_col_types = FALSE` to quiet this message.
...Done! (in 30.69885 secs)
# A tibble: 124 x 4
SpecCode Species ComName Language
<chr> <chr> <chr> <chr>
1 69 Gadus morhua Atlantic cod English
2 69 Gadus morhua Bacalao English
3 69 Gadus morhua Bacaleau English
4 69 Gadus morhua Baccalao English
5 69 Gadus morhua Baccale English
6 69 Gadus morhua Baccalo English
7 69 Gadus morhua Bank cod English
8 69 Gadus morhua Bank fish English
9 69 Gadus morhua Bastard English
10 69 Gadus morhua Berry fish English
# i 114 more rows
# i Use `print(n = ...)` to see more rows
@cboettig Hi, I now know why this issue happens to me and to all users in China, like @Laura61616 and my other colleagues. Because of the firewall, we can't access the huggingface website directly. Although I can use a VPN to access the website in the browser, the VPN doesn't apply to the R environment. I tried to set up the VPN for R separately, but failed. Then I found a huggingface mirror site (https://github.com/padeoe/hf-mirror-site, https://hf-mirror.com/), which may solve this issue for users located in China. I downloaded the functions from the rfishbase package and changed the code
hf <- "https://huggingface.co"
to
hf <- "https://hf-mirror.com"
Then it works well. So I wonder if you could provide an option for users to choose which link to use, or add a condition so that when the first link doesn't work, it falls back to the mirror site. Thank you so much!
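The fallback requested here could look roughly like the sketch below. Everything named in it is an assumption for illustration: `hf_base_url()`, the `rfishbase.hf_url` option, and the `reachable` argument are hypothetical and not part of the rfishbase API. Only the two host URLs come from this thread.

```r
# Hypothetical helper, not part of rfishbase: choose which Hugging Face host to use.
hf_base_url <- function(
    candidates = c("https://huggingface.co", "https://hf-mirror.com"),
    reachable = function(url) {
      # Default probe: TRUE if a HEAD request gets any HTTP response within 5 seconds.
      tryCatch({
        httr::HEAD(url, httr::timeout(5))
        TRUE
      }, error = function(e) FALSE)
    }) {
  # 1. An explicit user choice (hypothetical option name) always wins.
  override <- getOption("rfishbase.hf_url")
  if (!is.null(override)) return(override)

  # 2. Otherwise try each host in order and keep the first one that answers.
  for (url in candidates) {
    if (isTRUE(reachable(url))) return(url)
  }
  stop("Could not reach any of: ", paste(candidates, collapse = ", "))
}
```

With something like this, a user behind the firewall could set `options(rfishbase.hf_url = "https://hf-mirror.com")` once per session and skip the probing entirely, while everyone else keeps the default host.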
@guohuansu Thank you very much for tracking this down! Would you be interested in sending a PR to add this option?
@cboettig Yes, I've opened a pull request. Please check whether it works.
Hi, I updated rfishbase to version 5.0.0 today, and now none of the functions work. When I run a function, the error below is shown:
Error in open.connection(con, "rb") : cannot open the connection to 'https://huggingface.co/api/datasets/cboettig/fishbase/tree/main/data/fb'
In addition: Warning message:
In open.connection(con, "rb") : URL 'https://huggingface.co/api/datasets/cboettig/fishbase/tree/main/data/fb': Timeout of 60 seconds was reached
I tried reinstalling the package several times and switching the VPN on and off, but nothing changed.
Can anyone help me solve the problem? Thanks in advance!