hypothesis / product-backlog

Where new feature ideas and current bugs for the Hypothesis product live

Determine representation of import-related annotation metadata #1511

Closed lyzadanger closed 11 months ago

lyzadanger commented 11 months ago

Backend and services: Determine representation of import-related annotation metadata and ensure API services can take and provide import metadata.

marcospri commented 11 months ago

The way H's API handles additional metadata is a bit too flexible, see:

https://hypothes-is.slack.com/archives/C1MA4E9B9/p1688637604519929

Changing that is outside the scope of this project, and tricky to do in any case because of backward-compatibility concerns for clients that rely on that behavior.

Due to that flexibility, we can choose between different approaches with no code changes.

IMO we have to balance two issues while picking one representation:

With that in mind, my preferred approach would be something along these lines:

POST http://localhost:5000/api/annotations
{
  "extra": {
    "original_id": "XXX",
    "source": "import"
  },
  ...
  "text": "...",
  "document": {...},
  "target": [...]
}

Some other examples follow. Note that the representation used when POSTing will be mirrored when GETting (and searching, etc.).

The representation in the DB will also match this structure.
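As a rough illustration of the preferred shape, here is a minimal Python sketch that builds such a request body. The helper name `build_import_payload` and the sample annotation text are hypothetical; only the `extra`, `original_id` and `source` keys come from the proposal above, and the final names may change during implementation.

```python
import json


def build_import_payload(annotation: dict, original_id: str) -> dict:
    """Return a copy of `annotation` with import metadata under "extra".

    Hypothetical helper: it only illustrates the proposed JSON shape.
    """
    payload = dict(annotation)
    # Copy any existing "extra" so the caller's dict is not mutated
    # and metadata the client already set there is preserved.
    extra = dict(payload.get("extra", {}))
    extra["original_id"] = original_id
    extra["source"] = "import"
    payload["extra"] = extra
    return payload


payload = build_import_payload({"text": "An imported note"}, original_id="XXX")
print(json.dumps(payload, indent=2))
```

Because the API mirrors what was POSTed, a client would expect to read the same `extra` object back when fetching or searching for the annotation.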

Nested structure inside a general metadata container

POST http://localhost:5000/api/annotations
{
  "extra": {
    "import": {
      "original_id": "XXX"
    }
  },
  ...
  "text": "...",
  "document": {...},
  "target": [...]
}

Top-level fields

POST http://localhost:5000/api/annotations
{
  "source": "import",
  "original_id": "XXX",
  ...
  "text": "...",
  "document": {...},
  "target": [...]
}

Nested structure

POST http://localhost:5000/api/annotations
{
  "import": {
    "original_id": "XXX"
  },
  ...
  "text": "...",
  "document": {...},
  "target": [...]
}

acelaya commented 11 months ago

I'm personally ok with using your preferred approach.

The import_id is already a "piece of text" by design for other reasons, so I think it should be OK to put it inside extra for now.

marcospri commented 11 months ago

Going with the proposal above, modulo any naming changes made during implementation to be reviewed in PRs.

POST http://localhost:5000/api/annotations
{
  "extra": {
    "original_id": "XXX",
    "source": "import"
  },
  ...
  "text": "...",
  "document": {...},
  "target": [...]
}
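To make the difference between the chosen shape and the alternatives listed earlier in the thread concrete, the following Python sketch shows the lookup path each one would need on the reading side. The function name `find_original_id` is hypothetical and this is not code from H; it assumes the annotation has already been parsed into a plain dict.

```python
from typing import Optional


def find_original_id(annotation: dict) -> Optional[str]:
    """Look up the import ID under each shape discussed in this thread."""
    extra = annotation.get("extra") or {}
    nested_in_extra = extra.get("import") or {}
    top_level_import = annotation.get("import") or {}
    candidates = (
        extra.get("original_id"),             # chosen: flat keys inside "extra"
        nested_in_extra.get("original_id"),   # nested structure inside "extra"
        annotation.get("original_id"),        # top-level fields
        top_level_import.get("original_id"),  # top-level nested structure
    )
    # Return the first shape that yields a value, or None if none does.
    return next((value for value in candidates if value is not None), None)
```

With the representation chosen above, only the first lookup is needed; the others are listed purely for comparison.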