concepticon / norare-data

Cross-Linguistic Norms, Ratings, and Relations for Words and Concepts

Invalid data for specified datatype in VanHeuven-2014-Frequency #179

Closed xrotwang closed 2 years ago

xrotwang commented 2 years ago

The variable ENGLISH_FREQUENCY_CBEEBIES in VanHeuven-2014-Frequency is declared with datatype integer, but its values are written as floats and cannot be read as integers:

$ csvcut -t -c ENGLISH_FREQUENCY_CBEEBIES raw/norare-data/concept_set_meta/VanHeuven-2014-Frequency/VanHeuven-2014-Frequency.tsv
ENGLISH_FREQUENCY_CBEEBIES
0.0
23.0
11.0
0.0
7.0
0.0
0.0
16.0
20.0
67.0
61.00000000000001
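
The mismatch can be illustrated with a minimal sketch, assuming the reader applies a strict integer parse comparable to Python's `int()` on the raw string. The helper name `is_integer_literal` is hypothetical and not part of any norare tooling; the sample values are copied from the csvcut output above.

```python
def is_integer_literal(value: str) -> bool:
    """Return True if int() accepts the raw string, False otherwise."""
    try:
        int(value)
        return True
    except ValueError:
        # Strings such as "0.0" carry a decimal point, which int() rejects.
        return False


# Values taken from the ENGLISH_FREQUENCY_CBEEBIES column shown above.
values = ["0.0", "23.0", "61.00000000000001"]
for v in values:
    print(v, is_integer_literal(v))
```

Under this assumption even whole-number values such as `23.0` fail, because the decimal point alone makes the string an invalid integer literal.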

xrotwang commented 2 years ago

Seems to be a problem with all VanHeuven variables:

VanHeuven-2014-Frequency ENGLISH_FREQUENCY
VanHeuven-2014-Frequency ENGLISH_FREQUENCY_CBEEBIES
VanHeuven-2014-Frequency ENGLISH_FREQUENCY_CBBC
VanHeuven-2014-Frequency ENGLISH_FREQUENCY_BNC
VanHeuven-2014-Frequency ENGLISH_FREQUENCY_LOG
VanHeuven-2014-Frequency ENGLISH_FREQUENCY_LOG_CBEEBIES
VanHeuven-2014-Frequency ENGLISH_FREQUENCY_LOG_CBBC
VanHeuven-2014-Frequency ENGLISH_FREQUENCY_LOG_BNC
VanHeuven-2014-Frequency ENGLISH_CD
VanHeuven-2014-Frequency ENGLISH_CD_CBEEBIES
VanHeuven-2014-Frequency ENGLISH_CD_CBBC

xrotwang commented 2 years ago

Wait, that's not true for all variables. I'll create a PR to fix it.
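
One way to narrow down which variables are actually affected is to scan each column that is declared as integer and record the first value that fails an integer parse. The following is only a sketch under assumptions: the function name `non_integer_columns` is hypothetical, the list of columns to check is passed in by hand rather than read from the dataset's metadata, and the sample data is a toy table in which `HYPOTHETICAL_COUNT` is an invented column used purely for contrast.

```python
import csv
import io


def non_integer_columns(tsv_text: str, columns: list[str]) -> dict[str, str]:
    """Map each listed column to the first value that int() rejects."""
    offenders: dict[str, str] = {}
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        for col in columns:
            if col in offenders:
                continue  # one offending value per column is enough
            value = row[col]
            try:
                int(value)
            except (ValueError, TypeError):
                # ValueError: not an integer literal (including empty cells).
                # TypeError: the cell is missing from a short row (None).
                offenders[col] = value
    return offenders


# Toy table: the first column reuses values from the issue,
# the second column is invented for illustration.
sample = (
    "ENGLISH_FREQUENCY_CBEEBIES\tHYPOTHETICAL_COUNT\n"
    "0.0\t3\n"
    "23.0\t7\n"
)
print(non_integer_columns(sample, ["ENGLISH_FREQUENCY_CBEEBIES", "HYPOTHETICAL_COUNT"]))
```

Note that this sketch also flags empty cells, so a real check would need to decide how missing values should be treated before reporting a column as invalid.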