datagouv / datagouv-search-indicator

Keep track of udata search results performances
https://etalab.github.io/datagouv-search-indicator/
MIT License

Add datasets_wikidata.csv #22

Closed pachevalier closed 3 years ago

pachevalier commented 3 years ago

…//github.com/etalab/datagouv-search-indicator/issues/21

I'm not sure which strategy is best: overwrite the original datasets.csv, or keep two different files, one with Wikidata and one without.

We go from 49 lines to 121!

Here is the code that was used to concatenate the files.

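library(tidyverse)

# Stack the original expectations with the Wikidata labels and aliases,
# drop exact duplicate rows, sort by the `expected` column, and write the result.
read_csv("data/datasets.csv") %>%
  bind_rows(read_csv("data/wikidata_labels.csv")) %>%
  bind_rows(read_csv("data/wikidata_alias.csv")) %>%
  distinct() %>%
  arrange(expected) %>%
  write_csv("data/datasets_wikidata.csv")

abulte commented 3 years ago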

I think a datasets_wikidata.csv containing only Wikidata data (sic) is the way to go. We probably don't want to mix "human" expectations with "(wiki)data" ones, and appending both input files when running the script (if we want to) will be easy.
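To illustrate the "appending both input files" idea, here is a minimal sketch in R, matching the snippet above. It assumes both files share the same columns; the `query` column, the sample rows, and the temporary files are hypothetical stand-ins, and only `expected` is known from the thread. The indicator script itself may well do this differently.

```r
library(readr)
library(dplyr)

# Hypothetical stand-ins for data/datasets.csv and data/datasets_wikidata.csv.
human_path <- tempfile(fileext = ".csv")
wikidata_path <- tempfile(fileext = ".csv")
write_csv(data.frame(query = "sirene", expected = "base-sirene"), human_path)
write_csv(data.frame(query = "insee", expected = "base-sirene"), wikidata_path)

# Read every column as text so both files stack without type conflicts.
as_text <- cols(.default = col_character())

# Keep the files separate on disk and combine them only at run time,
# tagging each row with the file it came from.
combined <- bind_rows(
  human = read_csv(human_path, col_types = as_text),
  wikidata = read_csv(wikidata_path, col_types = as_text),
  .id = "source"
)
```

The `source` column keeps "human" and Wikidata expectations distinguishable even after they are appended.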

pachevalier commented 3 years ago

OK, the new dataset has only the 76 lines coming from Wikidata.

abulte commented 3 years ago

Can you replace NA with nothing? 😇
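One possible way to do this with readr, sketched under assumptions: `write_csv()` takes an `na` argument that controls how missing values are written. The sample rows, the `query` column, and the temporary output path below are hypothetical.

```r
library(readr)

# Hypothetical rows: the second one has a missing `expected` value.
rows <- data.frame(
  query = c("sirene", "insee"),
  expected = c("base-sirene", NA),
  stringsAsFactors = FALSE
)

out_path <- tempfile(fileext = ".csv")

# na = "" writes missing values as empty fields instead of the default "NA".
write_csv(rows, out_path, na = "")

# Read the raw lines back to inspect what was written.
written <- read_lines(out_path)
```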

abulte commented 3 years ago

🎉 🙏

abulte commented 3 years ago

bind_rows(., read_csv("data/wikidata_labels‧csv")) %>%

Are there really middle dots (interpuncts) in R? 🤯

pachevalier commented 2 years ago

No, it's the Firefox "écriture inclusive" (inclusive writing) extension. It needs to be disabled ;)