neulab / explainaboard_web

MIT License
8 stars 2 forks source link

Add checks for building the Python client in CI #363

Open PaulCCCCCCH opened 2 years ago

PaulCCCCCCH commented 2 years ago

Currently CI only builds and publishes the Python client when pushing to the main branch, and we have no checks on feature branches. Thus, some errors in openapi.yaml only surfaces when merging a PR (e.g., passes all checks on this branch, but fails on main).

Would be nice if we add checks for building the Python client on feature branches.

tetsuok commented 2 years ago

@lyuyangh I think #372 fixes this issue. Could you confirm?

lyuyangh commented 2 years ago

@tetsuok Thanks! This is something different. This refers to the py client we generate in py-client-release.yml. The py client is released so our users can interact with explainaboard_web programmatically and the release workflow is triggered on push to main. This is problematic because we won't know the generated py client has an issue until we merge to main. I think we can run codegen for py client in CI so that we know the code change doesn't break the py client.

tetsuok commented 2 years ago

@lyuyangh I see. Thanks for the clarification!

tetsuok commented 2 years ago

@lyuyangh I have another question. Does the Python client directly interact with the backend server?

lyuyangh commented 2 years ago

@tetsuok Yes.

The workflow releases a package called explainaboard_api_client which provides all the models and a bunch of functions to make HTTP requests. This package is not intended to be used by the users directly though. We have another thin wrapper explainaboard_client to help users choose different environments, authenticate and encode system output files, etc. It also provides a CLI. The idea is explainaboard_client doesn't change that often. If we release a new version of the API, the user only needs to pip install -U explainaboard_api_client. And that package is theoretically guaranteed to be compatible with the new version of the API because it's generated based on openapi.yaml.

tetsuok commented 2 years ago

@lyuyangh Thanks for the detailed explanation. That's useful for me to understand the current surrounding ecosystem. If I understand correctly, the backend server is used by three users: the frontend, explainaboard_api_client, and explainaboard_client. I think the backend server needs to be used by the frontend only. If we really want to provide Web APIs to external users, we need a API server. The reasons are as follows:

  1. The backend server has a dual responsibility: as a backend server for the frontend and as an API server for external clients. It's better to design a server such that it has a single clear responsibility. Exposing the functionality (including APIs for internal use) of the backend server to external users becomes harder to make changes only for frontend because we need to always care about the effect on external users. Also, doing re-architecture of the backend server becomes very difficult.
  2. For security reasons. Some malicious users can attack the backend server through explainaboard_api_client, and explainaboard_client. This means it's possible to make the web site unavailable.

cc: @odashi

lyuyangh commented 2 years ago

@tetsuok Thanks for the comment! I am not sure if I understand your concerns correctly. I think when we say backend, we are referring to different things.

explainaboard_web has two parts: the web frontend (react app that runs in the browser) and a backend. The backend (flask app) = business logic + API server. The API server exposes a set of endpoints/APIs. The front end sends HTTP requests to these APIs to talk to the backend. These APIs have access control but they are public so anyone can send HTTP requests to them. We don't have any APIs for internal use only. If we do in the future, those won't be public. There is a swagger UI page for these APIs at https://explainaboard.inspiredco.ai/api/ui (it is down for some reason... I'll need to fix it).

explainaboard_client sends requests to the same set of APIs to talk to the backend. To the backend, the web frontend is just another client. It is the same thing as the py clients.

I think this is safe. If we want to add an additional layer between the clients and the backend for load balancing, rate limiting, etc, I think the web front end needs to go through that layer, too.

odashi commented 2 years ago

I read the thread now.

I think the current webapp's backend is actually not a so-called "web backend": it is just the API server of the provided functionalities. I think this is reasonablly lightweight at this point.

When we continue development, the "web backend" is required to implement many stuff related to only the frontend, and discrepancy of requirements between the webapp and direct API calls usually increases. As a result, coupling these functionality into the same server makes maintenance hard.

If we want to add an additional layer between the clients and the backend for load balancing, rate limiting, etc, I think the web front end needs to go through that layer, too.

We need to control traffic by the internal management and the Web frontend should be agnostic from this.

odashi commented 2 years ago

tes drawio (3)