FrancoisMentec / OpenCompare2

https://opencompare.org/

Wikipedia import (wrong "typing" of numeric values) #2

Open FAMILIAR-project opened 7 years ago

FAMILIAR-project commented 7 years ago

Hi,

I wanted to extract the table here: https://fr.wikipedia.org/wiki/Wikip%C3%A9dia#Versions_linguistiques. The import gives this result: https://opencompare.org/pcm/59b6366f1ce2640a4802dfb2

Unfortunately, cell values are handled strangely: the value "+0001 898 626," is imported as a string instead of simply being the numerical value 1 898 626.
I understand that the spaces between digits make the extraction harder, but I think we can do better.
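
To illustrate what "doing better" could look like, here is a minimal sketch in Python. It is not OpenCompare's importer code: `parse_grouped_int` is a hypothetical helper, and it assumes that a cell made of an optional sign, digits grouped with (possibly non-breaking) spaces, and trailing punctuation is meant to be an integer.

```python
import re
from typing import Optional

# Hypothetical pattern: optional sign, digits, then groups of exactly three
# digits separated by a regular, non-breaking, or narrow non-breaking space.
_GROUPED_INT = re.compile(r"^[+-]?\d+(?:[ \u00a0\u202f]\d{3})*$")


def parse_grouped_int(cell: str) -> Optional[int]:
    """Return the integer in a space-grouped cell, or None if it does not fit."""
    # Drop surrounding whitespace and trailing punctuation such as the "," above.
    text = cell.strip().rstrip(",.;").strip()
    if not _GROUPED_INT.match(text):
        return None  # not a grouped integer: leave the cell as a string
    negative = text.startswith("-")
    digits = re.sub(r"\D", "", text)  # keep only the digits
    value = int(digits)  # int() also drops leading zeros such as "0001"
    return -value if negative else value


print(parse_grouped_int("+0001 898 626,"))  # 1898626
print(parse_grouped_int("not a number"))    # None
```

Requiring groups of exactly three digits is a deliberate choice in this sketch, so that text such as "12 34" is not silently merged into 1234.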

There is also no easy workaround: we cannot change the "type" of a whole column to make the choice explicit. For instance, I would like to state that "Nombre d'articles" is an integer column, which would then force the parsing/casting of its strings into integers.
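
The column-level idea could be sketched as follows, again in Python and again only as an assumption about how such a feature might behave. `cast_column_to_int` and the row layout (a list of dicts keyed by column name) are hypothetical, not OpenCompare's data model. Because the user has explicitly declared the column to be an integer column, the cast is lenient: it keeps the digits and discards everything else, and it reports the rows it could not convert.

```python
import re
from typing import Dict, List, Tuple, Union

Cell = Union[str, int]
Row = Dict[str, Cell]


def cast_column_to_int(rows: List[Row], column: str) -> Tuple[List[Row], List[int]]:
    """Return copies of the rows with `column` cast to int, plus failed row indexes."""
    converted: List[Row] = []
    failed: List[int] = []
    for index, row in enumerate(rows):
        new_row = dict(row)  # copy, so the imported data is not modified
        raw = str(row.get(column, ""))
        digits = re.sub(r"\D", "", raw)  # lenient: drop spaces, signs, punctuation
        if digits:
            value = int(digits)
            # Restore the sign only when the cell starts with a minus.
            new_row[column] = -value if raw.strip().startswith("-") else value
        else:
            failed.append(index)  # no digits at all: keep the original value
        converted.append(new_row)
    return converted, failed


# Illustrative rows: the first value is the one quoted in this issue,
# the second is a made-up placeholder for an unparsable cell.
rows = [
    {"Nombre d'articles": "+0001 898 626,"},
    {"Nombre d'articles": "n/a"},
]
converted, failed = cast_column_to_int(rows, "Nombre d'articles")
print(converted[0]["Nombre d'articles"], failed)
```

A lenient cast like this would also turn a decimal such as "3.5" into 35, so it only makes sense once the column type has been stated explicitly; reporting the failed rows lets the user review what could not be converted.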

Information: source: https://opencompare.org/