mqAncientHistory / Lat-Epig

The Lat-Epig interface allows you to query the EDCS and save the search result in a TSV file and plot the results on a map of the Roman Empire without any prior knowledge of programming.
https://mybinder.org/v2/gh/mqAncientHistory/Lat-Epig/HEAD?urlpath=notebooks/EpigraphyScraper.ipynb
GNU General Public License v3.0

Fix commentary in the text of an inscription, <b> tag in HTML #13

Status: Closed (petrifiedvoices closed this issue 3 years ago)

petrifiedvoices commented 3 years ago

EDCS-74800023 - missing text of an inscription; double usage of <b> tag in HTML

Although the text of the inscription is present on the website, the scraper did not scrape it, only the comments (or, I suspect, the comments overwrote the text of the inscription).

HTML: P(ublius) Titius [3] / Primu[s sibi(?)] / et s[uis et(?)] / Primae [3] / matri Ma[3] / Primigenia[e 3] / in fr(onte) p(edes) X[3] / in ag(ro) p(edes) XI[3]

<b>comment:</b> <a href="http://www.aemiliaonline.it/reperti/stele/stele-di-publius-titus-primus" target="_blank">http://www.aemiliaonline.it/reperti/stele/stele-di-publius-titus-primus</a>

Currently scraped to CSV as: Inscription attribute: http://www.aemiliaonline.it/reperti/stele/stele-di-publius-titus-primus \n\n http://www.aemiliaonline.it/reperti/stele/stele-di-publius-titus-primus

Desired outcome:
Inscription attribute: P(ublius) Titius [3] / Primu[s sibi(?)] / et s[uis et(?)] / Primae [3] / matri Ma[3] / Primigenia[e 3] / in fr(onte) p(edes) X[3] / in ag(ro) p(edes) XI[3]
Comments attribute: http://www.aemiliaonline.it/reperti/stele/stele-di-publius-titus-primus
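To make the desired outcome concrete, here is a minimal sketch of how the two attributes could be separated. It is not the Lat-Epig scraper's actual code: the helper names `split_record` and `strip_tags` are hypothetical, and it assumes the inscription text precedes the first `<b>label:</b>` marker in the HTML fragment, as in the example above.

```python
import re
from html import unescape

# Hypothetical helpers for illustration; not taken from the Lat-Epig code base.
# Assumption: field labels appear as <b>label:</b>, and any text before the
# first label is the inscription itself.
LABEL_RE = re.compile(r"<b>\s*([^<:]+):\s*</b>", re.IGNORECASE)
TAG_RE = re.compile(r"<[^>]+>")


def strip_tags(fragment):
    """Remove HTML tags, decode entities, and trim surrounding whitespace."""
    return unescape(TAG_RE.sub("", fragment)).strip()


def split_record(fragment, text_label="inscription"):
    """Split an HTML fragment into a {label: value} dictionary.

    Text found before the first <b>label:</b> marker is stored under
    text_label, so a later field cannot overwrite the inscription.
    """
    fields = {}
    current = text_label
    pos = 0
    for match in LABEL_RE.finditer(fragment):
        value = strip_tags(fragment[pos:match.start()])
        if value:
            fields[current] = value
        current = match.group(1).strip().lower()
        pos = match.end()
    # Whatever follows the last label belongs to that label.
    value = strip_tags(fragment[pos:])
    if value:
        fields[current] = value
    return fields


url = "http://www.aemiliaonline.it/reperti/stele/stele-di-publius-titus-primus"
html_fragment = (
    "P(ublius) Titius [3] / Primu[s sibi(?)] / et s[uis et(?)]\n\n"
    '<b>comment:</b> <a href="' + url + '" target="_blank">' + url + "</a>"
)
record = split_record(html_fragment)
print(record["inscription"])  # P(ublius) Titius [3] / Primu[s sibi(?)] / et s[uis et(?)]
print(record["comment"])      # the URL only, without the surrounding <a> tag
```

Keeping each field under its own key means the comment can never replace the inscription text, which is the behaviour described as desired above.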

Examples of other inscriptions with a similar problem: EDCS-27601424, EDCS-10300305, EDCS-75900072 (53 inscriptions in total are affected by the HTML tag error)

Link to the CSVs with minimal examples (Git does not allow me to paste them here): https://github.com/sdam-au/EDCS_ETL/tree/master/output

petrifiedvoices commented 3 years ago

fixed by rewrite