KTH-Library / kthcorpus

R package to support workflows related to the corpus of publications from KTH
https://kth-library.github.io/kthcorpus
GNU Affero General Public License v3.0

Problems getting/reading ug_kthid_orcid.csv from Minio #143

Open awandahl opened 9 months ago

awandahl commented 9 months ago

Background: I am trying to render mods.qmd locally. Apart from the errors below, the file ug_kthid_orcid.csv doesn't seem to have been updated since May this year.

> kthcorpus::kthid_orcid()
Error in select():
! Can't subset columns past the end.
ℹ Location 2 doesn't exist.
ℹ There is only 1 column.

Quitting from lines 284-408 [unnamed-chunk-7] (mods.qmd)
Error in select():
! Can't subset columns past the end.
ℹ Location 2 doesn't exist.
ℹ There is only 1 column.
Backtrace:
  1. kthcorpus::kthid_orcid()
  2. dplyr:::select.data.frame(read_from_minio("ug_kthid_orcid.csv"), kthid = 1, orcid = 2)
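
The backtrace suggests that the downloaded file was parsed into a single column, so selecting column 2 by position fails. One possible cause (an assumption, not confirmed in this thread) is an empty, truncated, or differently delimited file. A minimal base R sketch of that failure mode with a guard; pick_kthid_orcid() is a hypothetical helper and not part of kthcorpus, and the sample values are made up:

```r
# Hypothetical helper (not part of kthcorpus): pick the first two columns by
# position, but stop with a descriptive message if fewer than two exist.
pick_kthid_orcid <- function(df) {
  if (ncol(df) < 2) {
    stop(
      "Expected at least 2 columns (kthid, orcid) but found ", ncol(df),
      "; the file may be empty, truncated, or use another delimiter.",
      call. = FALSE
    )
  }
  out <- df[, 1:2, drop = FALSE]
  names(out) <- c("kthid", "orcid")
  out
}

# Well-formed input: two comma-separated columns (made-up values).
good <- read.csv(
  text = "kthid,orcid\nu1abc123,0000-0000-0000-0001\n",
  stringsAsFactors = FALSE
)

# Malformed input: semicolon-delimited text read as comma-separated,
# so the parser produces a single column.
bad <- read.csv(
  text = "kthid;orcid\nu1abc123;0000-0000-0000-0001\n",
  stringsAsFactors = FALSE
)

ok <- pick_kthid_orcid(good)

# Capture the error message instead of aborting the script.
result <- tryCatch(pick_kthid_orcid(bad), error = function(e) conditionMessage(e))
```

A check like this would not fix the underlying data problem, but it would make the cause visible before the positional select runs.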

mskyttner commented 9 months ago

The data about kthid/orcid pairs comes from "ug_kthid_orcid.csv" and "diva_kthid_orcid.csv" combined.

The first dataset cannot be harvested from a GitHub Actions (GHA) workflow because a VPN connection is required (see data-raw/ug_kthid_orcid_upload.R, which could be scheduled to run on an internal server instead). This script has now been run manually, so the data should be fresh.

The second dataset can be harvested from DiVA (see data-raw/orcid_kthid_diva.R). This script has been run manually now, so data should be fresh.
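
As a rough illustration of what combining the two sources could look like, here is a base R sketch that stacks two kthid/orcid tables and drops exact duplicate pairs. This is an assumption about the general approach, not the package's actual implementation, and all values are made up:

```r
# Made-up stand-ins for the two sources named above.
ug <- data.frame(
  kthid = c("u1aaa", "u1bbb"),
  orcid = c("0000-0000-0000-0001", "0000-0000-0000-0002"),
  stringsAsFactors = FALSE
)
diva <- data.frame(
  kthid = c("u1bbb", "u1ccc"),
  orcid = c("0000-0000-0000-0002", "0000-0000-0000-0003"),
  stringsAsFactors = FALSE
)

# Stack both tables and keep each kthid/orcid pair only once.
combined <- unique(rbind(ug, diva))
rownames(combined) <- NULL
```

Note that this only removes identical pairs; a kthid that maps to different ORCID iDs in the two sources would remain as two rows and would need separate handling.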

I tried running the call now and did not get any exceptions:

> kthcorpus::kthid_orcid()
# A tibble: 6,406 × 2  
... etc ...