allenai / allennlp-demo

Code for the AllenNLP demo.
https://demo.allennlp.org
Apache License 2.0
195 stars 79 forks source link

Deliver a new AllenNLP demo (Tugboat). #700

Closed jonborchardt closed 3 years ago

jonborchardt commented 3 years ago

External Doc:

one'ish pager

Motivation:

AllenNLP is our most popular demo, but the codebase is hard to extend. We'd like to rebuild the UI to better serve AllenNLP, but with a general approach that we could also apply to Mosaic KG, the PRIOR demos, and other collections of demos.

Success Criteria:

Updated Estimate:

We have removed some of the tasks, so this is what is remaining for q1 There are 13 demos in total, of which 2 are very hard, 3 are hard and 7 are easy, which means we expect this to take 4-5 more weeks.

Related to:

schmmd commented 3 years ago

Old issue: https://github.com/allenai/reviz/issues/172

codeviking commented 3 years ago

Today we ran into a bit of an issue related to the integration of model cards into the UI.

The AllenNLP platform team has expressed a desire to migrate away from model ids that don't include the task. For instance, instead of using an id like bidaf, the id will be rc-bidaf, where rc indicates the task that the model was trained for.

This change was already made in the API that returns information about model cards. This means we can't lookup the model card for a model, unless we know the id in the new format. Some models in the demo already have this, in a field called pretrained_model_id, but others (like the NMN model) do not.

I put some duct-tape in place today to unblock @jonborchardt, but neither of us think we should maintain this solution, as we'll just be doing more of this as we convert additional tasks, or as we add new, similar models to the demo in the future.

Instead we'd like to migrate the demo to use the new id format, though this introduces some unexpected work.

The existing endpoints are of the form /api/:model-id, where :model-id is of the old form (i.e. bidaf). These URLs are hard-coded throughout the existing demo code, which means it'd be hard to update them in a single pass without breaking things (as it's a lot to test!). Instead we'd like to do a more iterative migration.

  1. We'll add a new (v1) API for producing the Kubernetes manifests for an application, that takes an additional parameter for specifying a list of alternative paths to be routed to the endpoint. This new API will use the /api/model prefix instead of /api.

    common.v1.APIEndpoint(model.id, altPaths=['/api/bidaf'])
  2. We'll port models incrementally to the new API. As we do so support will be added for the new urls (/api/model/rc-bidaf) while retaining support for the old one (/api/bidaf).

    {
    - id: 'bidaf'
    + id: 'rc-bidaf'
    }

    Since the id is changing, every deployment will create a new Kubernetes deployment. So we'll need to make sure to delete the old ones as this incremental migration occurs.

  3. We'll port the Reading Comprehension endpoints first, after which we can modify the new UI code to use the new ID format. At this point I'll be able to remove the hack mentioned above.

  4. We'll port the endpoints for the other tasks and models to the new mechanism.

  5. Once all of them are ported, we can remove the old API for configuring Kubernetes things, so that only v1 exists.

  6. Once all UI code is ported to the new mechanism, we can remove the altPaths if we want, thereby deprecating the old URL format.

As part of this I'd also like to do a little cleanup. Right now each endpoint returns some information when querying/api/:model-id. For instance, hitting /api/bidaf returns some basic information about that model. This information isn't used for anything except for the main info route.

I'd like to change this information to be returned when hitting /api/:model-id/info instead, so that /api/:model-id can be reserved for other behavior. This is important for the /tasks endpoint that I added recently, as the/tasks endpoint right now returns the info, which means the list of tasks is returned by /api/tasks/all, which is a little silly. If we make this change then /api/tasks can return all of the tasks in the demo, which is less surprising.

Also, it's worth noting that I don't plan on migrating the python packages to match the new id syntax. So api/allennlp_demo/bidaf will remain unchanged.

jonborchardt commented 3 years ago

this makes sense!

codeviking commented 3 years ago

I think we need to do this too: https://gist.github.com/codeviking/e69df59f664380276a21efec8337478e