boun-tabi-LMG / turkish-academic-text-harvest

Evaluate extractor #15

Closed by furkanakkurt1335 12 months ago

furkanakkurt1335 commented 1 year ago

Once all the other steps are done, we need to finalize the extractor script by evaluating it on several sample outputs before running it on all the PDFs.
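One way to pick the PDFs for such a spot check is to draw a small, reproducible random sample from the input directory. This is only a sketch: the helper name `sample_pdfs`, the directory layout, and the sample size are assumptions for illustration, not part of the repository.

```python
import random
from pathlib import Path


def sample_pdfs(pdf_dir, k=5, seed=0):
    """Return up to k PDF paths from pdf_dir, chosen reproducibly.

    Hypothetical helper: the directory layout and sample size are assumptions.
    """
    # Sort first so the sample depends only on the seed, not on filesystem order.
    pdfs = sorted(Path(pdf_dir).glob("*.pdf"))
    rng = random.Random(seed)
    return rng.sample(pdfs, min(k, len(pdfs)))
```

Running the extractor on the returned paths and reading the outputs by hand would give a quick check before the full run.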

furkanakkurt1335 commented 1 year ago

2 points I have right now for the script output:

furkanakkurt1335 commented 1 year ago

Started a dictionary in e7a529b.

furkanakkurt1335 commented 12 months ago

@zeynepyirmibes handled the above-mentioned dictionary with replacement_dict in /normalize.py.
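For readers unfamiliar with the approach, a dictionary-based replacement pass might look roughly like the sketch below. The entries and the `normalize` function are hypothetical examples of common PDF extraction artifacts; the actual contents of replacement_dict in normalize.py are not shown in this thread and may differ.

```python
# Hypothetical entries for illustration only; the real replacement_dict
# in normalize.py may contain different mappings.
replacement_dict = {
    "\ufb01": "fi",  # "fi" ligature
    "\ufb02": "fl",  # "fl" ligature
    "\u00ad": "",    # soft hyphen left over from line breaks
}


def normalize(text, replacements=replacement_dict):
    """Apply each replacement to the text, in dictionary order."""
    for old, new in replacements.items():
        text = text.replace(old, new)
    return text
```

A plain dictionary keeps the mapping easy to extend as new artifacts turn up in the extractor output.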

extractor.py was used for yok-tez and dergipark. In the end, we were happy with the script's outputs.