alephdata / memorious

Lightweight web scraping toolkit for documents and structured data.
https://docs.alephdata.org/developers/memorious
MIT License

Update the parse function to accept an entity id #189

Closed Rosencrantz closed 1 year ago

Rosencrantz commented 3 years ago

This includes a tweak to the parse function so that it generates an entity id before creating an entity. The id can be generated in one of two ways:

  1. Supply a list of XPath expressions whose values will be concatenated and hashed to generate a unique key.
  2. Do nothing and allow the parse function to automatically generate a key based on the URL of the page that is being parsed.
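
The two options above can be sketched roughly as follows. This is only an illustration, not the code from this PR: `make_entity_id`, its parameters, the choice of SHA-1, and the separator between values are all assumptions, and the standard library's ElementTree (which supports only a subset of XPath) stands in for whatever parser the crawler actually uses.

```python
import hashlib
import xml.etree.ElementTree as ET


def make_entity_id(url, tree=None, key_paths=None):
    """Return a stable hex key for an entity (hypothetical helper)."""
    parts = []
    if tree is not None and key_paths:
        for path in key_paths:
            # Option 1: collect the text found at each configured path.
            value = tree.findtext(path)
            if value and value.strip():
                parts.append(value.strip())
    if not parts:
        # Option 2: nothing configured or nothing matched, so use the page URL.
        parts = [url]
    # Assumption: a separator keeps ("ab", "c") distinct from ("a", "bc").
    joined = "\x1f".join(parts)
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()


page = ET.fromstring(
    "<doc><title>Annual report</title><date>2020-01-01</date></doc>"
)
url = "https://example.org/reports/1"

by_xpath = make_entity_id(url, page, ["./title", "./date"])  # option 1
by_url = make_entity_id(url)                                 # option 2
```

Because both keys depend only on the page content or URL, re-running the crawler against the same page should yield the same id, so the entity is updated rather than duplicated.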

sunu commented 2 years ago

Could you fix the merge conflict too? I think it comes from the changes Simon made to fix a couple of issues in the aleph_emit operation.

Rosencrantz commented 1 year ago

This died