Open hargup opened 9 years ago
@notconfusing some of the files you have used in the notebooks, like helpers/world_cultures_shortcut.json and helpers/wiki_code_map.json, are not present in this repository. Can you please add them?
I'm creating a basic Python package for WIGI at https://github.com/notconfusing/WIGI/tree/hargup/refactoring. My current approach is to move recurring pieces of code into the package, and then use that code from the package to reproduce the notebook. I would like to completely decouple data retrieval, data processing and data presentation.
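To illustrate the decoupling described above, here is a minimal sketch of three independent stages. The function names, the column names (country, gender) and the sample rows are all made up for illustration and are not taken from the WIGI code.

```python
import csv
import io
from collections import Counter


def retrieve(source):
    """Data retrieval: parse raw CSV text into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(source)))


def process(rows, key):
    """Data processing: count rows per (key value, gender) pair."""
    return Counter((row[key], row["gender"]) for row in rows)


def present(counts):
    """Data presentation: render the counts as sorted CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["group", "gender", "count"])
    for (group, gender), n in sorted(counts.items()):
        writer.writerow([group, gender, n])
    return out.getvalue()


# Hypothetical sample input, for illustration only.
SAMPLE = "country,gender\nA,female\nA,male\nA,female\nB,male\n"

if __name__ == "__main__":
    print(present(process(retrieve(SAMPLE), "country")))
```

Because each stage only depends on the return value of the previous one, any of them could be swapped out (for example, reading a real snapshot file instead of the sample string) without touching the others.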
@hargup fantastic plan on decoupling all the separate stages.
\me inhales deeply. OK, snapshot_data comes from this Java program: https://github.com/notconfusing/WIGI/blob/master/GenderIndexProcessor.java It's the thing we will have to run every week. In order to run it you need Wikidata Toolkit (WDTK). I want to get this happening on Wikimedia Labs because the ~2GB Wikidata dump that it needs would be available over the local network rather than as a big download. However, if it helps, you can just run WDTK locally for now.
BTW, when you say "package", do you mean making a "pip" package?
Yes, when I say package I mean standalone software which can be installed using pip or other package managers.
As per #8, first focus on Gender by Culture, Gender by Country (World Map), Gender by Date of Birth, and Wikipedia Language by Gender.
I've created one big Python script, gender-index-processing-standalone.py, that makes the graphable CSVs. So I'm not sure how this affects making a pip package, or refactoring. We don't really need the ipynb's except for demonstration purposes, so I'm going to move this to phase D.
@notconfusing can you brief me on how you generated the snapshot_data? I should be able to write a script to generate them at regular intervals.