Wikipedia Gender Index (WIGI) uses Wikidata to produce gender-related statistics on Wikipedia biographies
\end{small}. As a design decision, we do not translate these Wikidata Q-IDs, in order to maintain language neutrality. We do, however, include functions to translate the Q-IDs into English (or any other language), which would render the above row as: \ \begin{small} Aung San Suu Kyi,1945,,female|,,Myanmar|,Yangon|,,politician|writer|human rights activist| \end{small}
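As an illustration of this translation step, the following is a minimal sketch and not the functions shipped with the dataset. The function name and the small label table are hypothetical; in practice the labels would be looked up in Wikidata for the chosen language.

```python
# Hypothetical label table for illustration only; real labels would be
# fetched from Wikidata in the desired language.
LABELS_EN = {
    "Q6581072": "female",
    "Q82955": "politician",
    "Q36180": "writer",
}


def translate_cell(cell, labels):
    """Translate one pipe-terminated cell of Q-IDs into labels.

    Q-IDs missing from the table are left untranslated, and an empty
    cell stays empty.
    """
    ids = [qid for qid in cell.split("|") if qid]
    return "".join(labels.get(qid, qid) + "|" for qid in ids)


translated = translate_cell("Q82955|Q36180|", LABELS_EN)
```

Keeping the Q-IDs in the stored data and translating only on demand means the same file can be rendered in any language for which labels exist.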
To represent Wikidata faithfully, the value of each property is stored as a list, since Wikidata allows a property to have multiple values. This happens either because two sources disagree on a value or because, as in the case of Aung San Suu Kyi, a person has several occupations (see Figure \ref{fig:aung}). We store each list inside the comma-separated sheet as | (``pipe'')-separated values.
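The storage format described here can be read back with a few lines of Python. This is a minimal sketch under the assumption that each value in a cell is followed by a pipe, as in the translated example row; the function name is hypothetical, and columns are addressed by position because their names are not assumed.

```python
import csv
import io


def parse_row(line):
    """Split one comma-separated row into a list of value lists.

    Each field is split on the pipe character; empty strings left over
    from the trailing pipe (or from an empty field) are dropped.
    """
    fields = next(csv.reader(io.StringIO(line)))
    return [[value for value in field.split("|") if value] for field in fields]


row = ("Aung San Suu Kyi,1945,,female|,,Myanmar|,Yangon|,,"
       "politician|writer|human rights activist|")
parsed = parse_row(row)
# The last field holds three values; empty fields become empty lists.
```

An empty field is parsed as an empty list, so missing properties and multi-valued properties are handled by the same code path.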
These multiple values introduce a design problem when aggregating on a property. Our method is to aggregate on the list as a whole, rather than on the individual items within it. In the case of Aung San Suu Kyi, this means that her occupation is stored as the list politician, writer, and human rights activist, and she is aggregated with all other humans who have the same three occupations. Since the dataset is open, interested researchers can use our raw data and aggregate it in any other way they want.
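The difference between the two aggregation strategies can be sketched as follows. This is an illustration only: the function names are hypothetical, the four cells are made-up toy data, and using the raw cell string as the grouping key (which makes the grouping sensitive to the order of values) is an assumption rather than a documented property of the dataset.

```python
from collections import Counter


def count_by_list(cells):
    """Count rows per whole cell, i.e. per complete list of values."""
    return Counter(cells)


def count_by_item(cells):
    """Count rows per individual value; a row with three values
    contributes to three separate counts."""
    counts = Counter()
    for cell in cells:
        counts.update(value for value in cell.split("|") if value)
    return counts


# Toy occupation cells, invented for this example.
cells = [
    "politician|writer|human rights activist|",
    "politician|",
    "politician|writer|human rights activist|",
    "writer|",
]
by_list = count_by_list(cells)
by_item = count_by_item(cells)
```

With list-level aggregation every row is counted exactly once, so the group sizes sum to the number of rows. With item-level aggregation the counts can sum to more than the number of rows, which is the alternative left open to researchers working from the raw data.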