Duke-Chronicle-Project / article-extraction


Implement rule-based article extraction #6

Open OlivierBinette opened 3 years ago

OlivierBinette commented 3 years ago

Following #2 and #3, investigate the use of layout rules to reconstruct articles or portions of articles.

BrandonBae commented 3 years ago

Following up on the title extraction in #5, I implemented @OlivierBinette's suggestion to add partIDs based on a title+text block. As of now, this still depends on extremely basic title extraction code. See the pushed code for an example CSV and the implementation.
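
For readers without the pushed code at hand, here is a minimal sketch of what assigning part IDs from title+text blocks could look like. It is not the code from the repository. It assumes the blocks are already in reading order and already labeled as `title` or `text`, and the names `assign_part_ids`, `type`, and `part_id` are hypothetical.

```python
from typing import Dict, List


def assign_part_ids(blocks: List[Dict[str, str]]) -> List[Dict[str, object]]:
    """Give each block a part_id; a new part starts at every title block.

    Assumption: `blocks` is in reading order and each block carries a
    "type" key that is either "title" or "text" (hypothetical schema).
    """
    part_id = -1
    labeled = []
    for block in blocks:
        # Start a new part at each title, or at the very first block
        # if the page opens with text that has no title above it.
        if block["type"] == "title" or part_id < 0:
            part_id += 1
        labeled.append({**block, "part_id": part_id})
    return labeled


# Made-up example blocks, for illustration only.
blocks = [
    {"type": "title", "text": "Campus News"},
    {"type": "text", "text": "First paragraph."},
    {"type": "text", "text": "Second paragraph."},
    {"type": "title", "text": "Sports"},
    {"type": "text", "text": "Game recap."},
]
labeled = assign_part_ids(blocks)
part_ids = [b["part_id"] for b in labeled]
print(part_ids)  # [0, 0, 0, 1, 1]
```

A rule like this is only as good as the title detection feeding it: a missed title merges two articles into one part, and a false title splits one article in two.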