Open ahalterman opened 8 years ago
This already exists at https://github.com/openeventdata/petrarch2/blob/master/petrarch2/data/config/PETR_config.ini. Additionally, the deployed version of the pipeline has always specified a separate config file for PETR, with the dictionaries living in a top-level directory. You can see how you'd hit that at https://github.com/openeventdata/petrarch2/blob/master/petrarch2/petrarch2.py#L530. The built-in dictionaries are included so that people can download and run Petrarch2 as-is, without also having to hunt for dictionaries.
If this isn't clear in the documentation/user guides, then it should be called out there in greater detail. In other words, I think this is a docs issue rather than a feature issue.
But there's a bigger issue here that goes well beyond the dictionaries: there's quite a bit of CAMEO-specific code in Petrarch-2, e.g. the fairly complex system by which certain word combinations modify the event code, followed by the conversion of the internal representation of the code to CAMEO via the utilities.convert_code() function. This was not the case in Petrarch-1, TABARI or KEDS, but was the case in the VRA-Reader and, as far as I know, ICEWS/Accent. Which is to say, the problem has been solved both ways.
My inclination (obviously...) is that the coder should be completely neutral to the coding system, but that's not what we've got at the moment in Petrarch-2. Getting there would require an extensive rewrite, probably best done by adding some sort of macro facility to the dictionaries. That would be a "good thing" (TM); TABARI came pretty close to having this, though it was never really used in working dictionaries, and Petrarch-2 already has a limited version as well.
Possibly the place to figure this out, however, is in deciding how the dependency-based version of the program is going to work.
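To make the "neutral to the coding system" idea a bit more concrete, here is a minimal sketch of keeping the scheme-specific mapping in data rather than in the coder. Everything in it is hypothetical: `make_converter`, the internal code names and the output labels are placeholders for illustration, not Petrarch-2's actual internal representation and not what utilities.convert_code() does.

```python
def make_converter(mapping, default=None):
    """Build a converter from internal event codes to one output coding scheme.

    `mapping` is plain data (it could be loaded from a dictionary file),
    so the coder itself never needs to know which scheme is in use.
    """
    def convert(internal_code):
        # Unknown internal codes fall back to `default` instead of raising.
        return mapping.get(internal_code, default)
    return convert


# Hypothetical internal codes mapped to two made-up output schemes.
scheme_a = {"MEET": "A-01", "THREATEN": "A-02"}
scheme_b = {"MEET": "cooperation", "THREATEN": "conflict"}

to_a = make_converter(scheme_a, default="A-00")
to_b = make_converter(scheme_b, default="unknown")

events = ["MEET", "THREATEN", "SOMETHING_ELSE"]
print([to_a(e) for e in events])
print([to_b(e) for e in events])
```

Switching coding systems then means swapping the mapping table rather than editing the coder. The word-combination rules that modify event codes are the harder part and are not addressed by this sketch.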
Petrarch2's code should be distinct from the dictionaries it uses. To make changes to the dictionaries more visible and to make it easier to switch in custom dictionaries, take the built-in dictionaries out of Petrarch2 and have it look for them at a location defined in the config file. Move the built-in dictionaries to the Dictionaries repo.
This should help clear up questions like #19.
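As a rough illustration of the proposal, the sketch below resolves dictionary file names against a directory named in the config file. It is an assumption-laden sketch, not Petrarch2's actual loading code: the `dictionary_dir` option and the `resolve_dictionary_paths` helper are hypothetical, and the `[Dictionaries]` section and option names are assumed here and should be checked against the real PETR_config.ini.

```python
import configparser
import os


def resolve_dictionary_paths(config_path):
    """Return {option_name: [paths]} for the dictionaries named in a config file.

    Assumes a [Dictionaries] section whose options hold one file name or a
    comma-separated list of file names, plus a hypothetical `dictionary_dir`
    option giving the directory that contains those files.
    """
    parser = configparser.ConfigParser()
    with open(config_path) as f:
        parser.read_file(f)
    section = parser["Dictionaries"]

    config_dir = os.path.dirname(os.path.abspath(config_path))
    # Hypothetical option; defaults to the directory holding the config file.
    base_dir = os.path.expanduser(section.get("dictionary_dir", config_dir))
    if not os.path.isabs(base_dir):
        # Relative locations are interpreted relative to the config file.
        base_dir = os.path.join(config_dir, base_dir)

    paths = {}
    for option, value in section.items():
        if option == "dictionary_dir":
            continue
        names = [name.strip() for name in value.split(",") if name.strip()]
        paths[option] = [os.path.join(base_dir, name) for name in names]
    return paths
```

With a layout like this, pointing Petrarch2 at a checkout of the Dictionaries repo or at a custom set of dictionaries would be a one-line change in the config file.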