Edit NEL process.py (orig:add README.md)

clamsproject / aapb-annotations

Repository to store manual annotation dataset developed for CLAMS-AAPB collaboration


Edit NEL process.py (orig:add README.md) #25

Closed: wricketts closed this issue 7 months ago

wricketts commented 1 year ago

Because

The 2022 December NEL project has an incomplete README.md file. As stated in the main repository README file, we want each subdirectory to contain its own README.md detailing annotation project-specific information (e.g. project name, annotator demographics, annotation environment information, gold generation code dependencies, etc.).

For this project in particular, the input was from the newshour-namedentity annotation project.

Done when

[x] The README.md is updated with relevant information.

Additional context

Issue https://github.com/clamsproject/aapb-annotations/issues/26 should probably be addressed first since it directly impacts this project.

jarumihooi commented 9 months ago

As of the completion of #60 and #63, this project only requires an update to the contents of and code for process.py plus an explanation.

jarumihooi commented 8 months ago

The only remaining item is to

[ ] update process.py and update the README.

```
### [`process.py`](process.py)
_TODO: Add detail to summary._
_TODO: get rid of `goldretriever.py` and change `process.py` to always read in local "raw" files_
Please see the docstring of [`process.py`](process.py).

The wikidata QID link has been automatically appended to the previous columns by `process.py`. The gold generation process automatically adds the QID.
```
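To make the "appended to the previous columns" wording concrete, here is a minimal sketch of that step. It is not the repository's `process.py`: the function name `append_qid_column`, the assumption that the entity text sits in the first column, and the `qid_lookup` mapping are all hypothetical and only illustrate the idea of leaving the existing columns untouched and adding one trailing link column.

```python
# Hypothetical illustration only; the real column layout lives in process.py.
WIKIDATA_PREFIX = "https://www.wikidata.org/wiki/"


def append_qid_column(rows, qid_lookup):
    """Return new rows with a Wikidata link appended as the last column.

    rows: iterable of lists holding the existing columns (left unchanged).
    qid_lookup: dict mapping entity text (assumed to be column 0) to a QID.
    """
    out = []
    for row in rows:
        qid = qid_lookup.get(row[0], "")
        # Leave the new cell empty when no QID is known for the entity.
        link = WIKIDATA_PREFIX + qid if qid else ""
        out.append(list(row) + [link])
    return out


if __name__ == "__main__":
    rows = [["Douglas Adams", "person"]]
    for row in append_qid_column(rows, {"Douglas Adams": "Q42"}):
        print("\t".join(row))
```

Returning new lists rather than mutating the input keeps the earlier columns verifiably identical before and after the step.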

keighrim commented 7 months ago

> TODO: get rid of `goldretriever.py` and change `process.py` to always read in local "raw" files

It looks like `goldretriever.py` is there to download the gold files from the NER project that are required for NEL gold generation? I don't think we can remove that part of the code.
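One possible middle ground, sketched under assumptions: read local files when they exist and only fall back to retrieval when the directory is empty. The name `load_gold_files`, the `.tsv` extension, and the `retrieve` callback (standing in for whatever `goldretriever.py` actually does) are hypothetical and not taken from the repository.

```python
from pathlib import Path
from typing import Callable, List


def load_gold_files(local_dir: Path, retrieve: Callable[[Path], None]) -> List[Path]:
    """Return gold files found in local_dir, retrieving them only if none exist.

    retrieve is a hypothetical stand-in for the download step; it is expected
    to write the NER gold files into the directory it is given.
    """
    local_dir.mkdir(parents=True, exist_ok=True)
    files = sorted(local_dir.glob("*.tsv"))  # assumed file extension
    if not files:
        # Nothing cached locally, so fall back to the retrieval step once.
        retrieve(local_dir)
        files = sorted(local_dir.glob("*.tsv"))
    return files
```

With this shape, `process.py` would always read from a local directory, and the download code would stay in place but run only when the NER gold files are missing.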

jarumihooi commented 7 months ago

Ok, to clarify, this seems to be something you suggested in one of the edits. Should it just be summarized instead? What should the raws be here?

2 months ago

[some small edits](https://github.com/clamsproject/aapb-annotations/commit/c512dfbc61b41850d0b3d093b36b57ef50586d46)
_TODO: get rid of goldretriever and change process.py to always read in local "raw" files_