wricketts closed 7 months ago
With #60 and #63 complete, the only work left on this project is updating `process.py` (its contents and code) and adding an explanation.
The only remaining item is:
```
### [`process.py`](process.py)
_TODO: Add detail to summary._
_TODO: get rid of `goldretriever.py` and change `process.py` to always read in local "raw" files_
Please see the docstring of [`process.py`](process.py).
The wikidata QID link has been automatically appended to the previous columns by `process.py`.
The gold generation process automatically adds the QID.
```
TODO: get rid of `goldretriever.py` and change `process.py` to always read in local "raw" files
It looks like the goldretriever is there to download the gold files from the NER project that are required for NEL gold generation? I don't think we can remove that part of the code.
Ok, to clarify, this seems to be something you suggested in one of the edits. Should it just be summarized instead? What should the raws be here?
2 months ago
[some small edits](https://github.com/clamsproject/aapb-annotations/commit/c512dfbc61b41850d0b3d093b36b57ef50586d46)
_TODO: get rid of goldretriever and change the proc.py to always read in local "raw" files_
Because
The 2022 December NEL project has an incomplete `README.md` file. As stated in the main repository README file, we want each subdirectory to contain its own `README.md` detailing annotation project-specific information (e.g. project name, annotator demographics, annotation environment information, gold generation code dependencies, etc.).
For this project in particular, the input was from the `newshour-namedentity` annotation project.
Done when
Additional context
Issue https://github.com/clamsproject/aapb-annotations/issues/26 should probably be addressed first since it directly impacts this project.