boun-tabi-LMG / turkish-academic-text-harvest

Handle missing footnotes and footnotes separated by lines #2

Closed by gokceuludogan 12 months ago

gokceuludogan commented 1 year ago

The mark_footnotes function should be updated to handle two cases: footnote sequences with a missing footnote, and footnotes separated by 2 or 3 lines. We need a way to identify footnotes reliably in both cases.

Examples:
Footnotes with missing values: 16 17 -1 19: tad_11493_136941, index: 271
Footnotes separated by 2 or 3 lines: 16 17 -1 -1 18: ortetut_50685_660118

gokceuludogan commented 1 year ago

The mark_footnotes function has been updated in commit 27af1a5 to allow for gaps of up to 4 lines and one missing footnote.
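
For readers following along, here is a minimal sketch of the tolerance rule described above. It is not the code from commit 27af1a5. The function name mark_footnote_runs, its parameters, and the input format are hypothetical. It assumes each line has already been reduced to its leading footnote number, with -1 standing for a line without a detected number (as the examples in this issue suggest), that "one missing footnote" means at most one skipped number between two consecutive detected footnotes, and that the lines in an accepted gap belong to the footnote block.

```python
from typing import List, Optional


def mark_footnote_runs(
    numbers: List[int], max_gap: int = 4, max_missing: int = 1
) -> List[bool]:
    """Mark lines that belong to a run of consecutive footnotes.

    numbers[i] is the footnote number detected on line i, or -1 if none.
    Two numbered lines are linked when at most `max_gap` unnumbered lines
    lie between them and at most `max_missing` footnote numbers are skipped.
    """
    marked = [False] * len(numbers)
    prev_idx: Optional[int] = None  # index of the last numbered line
    prev_num: Optional[int] = None  # footnote number on that line

    for idx, num in enumerate(numbers):
        if num == -1:
            continue  # unnumbered line: decided when the next number arrives
        if prev_idx is not None and prev_num is not None:
            gap = idx - prev_idx - 1  # unnumbered lines in between
            step = num - prev_num     # 1 = consecutive, 2 = one skipped
            if gap <= max_gap and 1 <= step <= 1 + max_missing:
                # Link the two footnotes and everything between them.
                for j in range(prev_idx, idx + 1):
                    marked[j] = True
        prev_idx, prev_num = idx, num

    return marked
```

Under these assumptions, both sequences from the issue description (16 17 -1 19 and 16 17 -1 -1 18) are marked in full, while a numbered line with no linked neighbor stays unmarked.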

The current version cannot detect individual footnotes separated by substantial bodies of text.

gokceuludogan commented 12 months ago

The function remains as it is. Detecting and eliminating individual footnotes is left for future work.