Rethink the design of explainaboard_web, explainaboard_api_client, explainaboard_client

pfliu-nlp commented 2 years ago

Currently, we have

(1) explainaboard_web -> explainaboard_api_client -> explainaboard_client

@neubig I'm curious about what the limitations are if we have

(2) explainaboard_web -> explainaboard_client

(considering that we have already had the different types of APIs from explainaboard_web)

To achieve (2): part of implementation of evaluate_system() https://github.com/neulab/explainaboard_client/blob/40b17c8d3808d3d305488a4094efc97f7beee82b/explainaboard_client/client.py#L46 could be moved into explainaboard_web, which could be further utilized by developers who are customized to other languages (e.g., js)

neubig commented 2 years ago

I think @lyuyangh is probably a better person to answer this question, as he came up with the design originally.

lyuyangh commented 2 years ago

@pfliu-nlp Thanks for bringing this up! I think the overall design of explainaboard_client can definitely be improved and we are not taking full advantage of explainaboard_api_client at the moment. Here are my answers to your questions. I also provided some more background information to explain my thought process which may be slightly off-topic. Also, a lot of it is just my personal opinions so we can go in some other directions too.

Terminology: (just to make sure we are referring to the same thing)

backend: API server. + business logic; basically everything under explainaboard_web/backend
API: API endpoints exposed by the backend; not the interface of the functions of some python function; not the interface of the SDK either
OpenAPI: a tool we use to declare and document APIs; the API definitions are in openapi.yaml
Swagger UI: a tool we use to visualize and publish our APIs. OpenAPI used to be called Swagger. That's why it is called Swagger UI (it's what it is called by the tool we are using).
1. In theory, we don't need explainaboard_api_client or explainaboard_client.
2. Our APIs are public so users can read the Swagger UI and send requests to our backend. They can use explainaboard_web that way and use whatever HTTP client they like (e.g. py requests, js fetch, curl, postman).
3. We haven't found the need to have private APIs but if we do, it won't affect users' ability to use the public APIs.
4. Whether the APIs are clearly documented and easy to use is a different discussion. I think they are but I am obviously biased because I designed some of them.

explainaboard_api_client is useful because:
1. It is easier to use explainaboard_api_client than to figure out what requests to send by reading Swagger UI. If you use curl to send requests and you misspelled a parameter name, it is a silent error. With explainaboard_api_client, that won't happen because the parameter names are provided by the python package.
2. You also get type hints with explainaboard_api_client. The type hints are not good but they are better than nothing (?).
```
# Example 1
def systems_get(self, **kwargs):
"""Returns a list of systems  # noqa: E501

Keyword Args:
    system_name (str): fuzzy match for system name. [optional]
    task (str): filter by task type. [optional]
    ...
"""
# It should be able to return a `System` object that has all the fields as
# attributes. Unfortunately, the python client template we use doesn't support
# that. We can modify the template to support it if we want.
```
```
# Example 2: when we submit a system (`SystemCreateProps` is from `explainaboard_api_client`)
props_with_loaded_file = SystemCreateProps(
    metadata=metadata,
    system_output=loaded_system_output,
    custom_dataset=loaded_custom_dataset,
)
```
3. It is autogenerated so it is guaranteed to match the APIs. Whenever we change openapi.yaml, we are technically releasing a new version of the API. We should try to make APIs backward compatible (especially if we have more users) but still, we may add new parameters or APIs.
We still decided to provide explainaboard_client for the users because:
1. There are two things that are difficult to do if you want to use the APIs directly: login and uploading files.
  - To log in, you need to specify the API Key and email. It is inconvenient for the user if they have to do it every time they make a request. explainaboard_client can cache that information. I don't think we do that right now but I think this is probably a useful thing to do.
  - To upload files, you need to load the file in memory, encode it in base 64 and then upload. We do all three steps in explainaboard_client so our users don't have to do it.
2. explainaboard_client is a thin wrapper for explainaboard_api_client and the original goal is just to handle the two things mentioned in 3.1. Additionally, explainaboard_client also provides a CLI for the users. I think the CLI can be more powerful but that is another discussion. For example, it can make requests while it is interacting with the users so it can provide a dropdown for task names so users can just choose from the list.
3. Because of 3.2, we don't need to update explainaboard_client that often. If we release a new version of the API, users just need to pip install -U explainaboard_api_client. We get that for free because it is auto-generated.
  - For example, if we want to add an is_mine parameter to GET /systems to get systems created by the current users only. We don't need to update explainaboard_client at all. It will automatically work when the user upgrades their explainaboard_api_client. Similarly, if a new API is added, it will automatically work.
  - @neubig I noticed that explainaboard_client no longer supports this. I made ExplainaboardClient a subclass of DefaultApi originally because I want to make the code forward compatible. But I do think the new interface makes it clear what arguments are available for each method so I am ok with this direction.
  - This is the main reason I decided to go with this design. If we don't plan to support this anymore, I think explainaboard_api_client may still be useful for the reasons mentioned 2.2. But again, the python client template we use right now only provides limited type hints. So, if you think the benefit it brings does not justify the additional complexity added to the system, we can get rid of explainaboard_api_client.

Could you please clarify your second point? I think evaluate_system() only makes one API call. The rest is just loading the file (3.1) and assigning the default values. If we want a JS client, I think all this logic needs to be reimplemented for that client too.

neulab / explainaboard_client

Rethink the design of explainaboard_web, explainaboard_api_client, explainaboard_client #41

(1) explainaboard_web -> explainaboard_api_client -> explainaboard_client

(2) explainaboard_web -> explainaboard_client

Swagger UI: a tool we use to visualize and publish our APIs. OpenAPI used to be called Swagger. That's why it is called Swagger UI (it's what it is called by the tool we are using).