matsduf commented 1 year ago

For batches there are three API methods (the equivalent experimental API in parenthesis):

add_api_user (user_create)
add_batch_job (batch_create)
get_batch_job_result (batch_status)

add_api_user (user_create)

This method is problematic and by default disabled. It is hard to see the use case of multiple users for batches. I suggest that this method is removed. At the same time, the username and API key concept should be removed.

Instead, introduce a token to be used to authenticate batches. There is no need for separate users. The token is to be configured in the Backend configuration file, which removes the need to have some special interface for setting the user/key. If there is no token in the configuration file, batches are not supported.

Remove the method
Update backend to use a token in configuration file instead of user/key
Remove user table

add_batch_job (batch_create)

There are two changes for this method:

Update to use token instead of user/key
Make it possible to add custom data for each zone name entered, see #1116

get_batch_job_result (batch_status)

This method should be replaced by the following methods (method names are just draft names):

get_batch_list

Input:

Batch ID.

Output:

The number of tests in the batch.
A list of all tests in the batch, where each test is given as hash-ID and zone name pair.

This method could result in a lot of data. Maybe paging would help? In the usual case this method is only used once, and then the calling script will cache the data.

get_batch_status

Input:

Batch ID
Optionally a list of one or more of the keywords "waiting", "running" and "done"

Output:

The number ofwaiting tests in the batch (progress = 0).
The number of running tests (0 < progress < 100).
The number of completed tests (progress = 100).
If the keyword "waiting" was included, a list of hash ID of tests where progress = 0
If the keyword "running" was included, a list of hash ID of tests where 0 < progress < 100
If the keyword "done" was included, a list of hash ID of tests where progress = 100

This is a method that the calling script will use multiple times, and there is no need to repeat the hash-IDs unless really requested. This method could result in a lot of data. Maybe paging would help?

matsstralbergiis commented 1 year ago

I really like the "get_batch_status" functionality as I today get a very big response when checking the progress of testing the whole se-zone.

Would it be better to only use "get_batch_list" and have another parameter asking for zones that are "waiting", "in progress" and "finished"? Then we don't need the "get_batch_completed" function.

ghost commented 1 year ago

Just a pernickety remark, it looks like the discussion is using the current API method names. Would it be possible to use the new experimental names, so that we can see how it integrates in the new/experimental API?

matsduf commented 1 year ago

Would it be better to only use "get_batch_list" and have another parameter asking for zones that are "waiting", "in progress" and "finished"? Then we don't need the "get_batch_completed" function.

I have updated the description. Is that in line with your suggestion?

matsduf commented 1 year ago

Would it be possible to use the new experimental names, so that we can see how it integrates in the new/experimental API?

I added those names to the text.

mattias-p commented 1 year ago

Access control

I really like the idea to remove user_create. Having the access control configured in backend_config.ini seems like a very reasonable approach.

An alternative approach I'd like to voice is to simply remove access control from batch_create. As an administrator you can already configure a public-facing zm-rpcapi with batch_create disabled and a different zm-rpcapi with batch_create enabled that is only accessible from a restricted network.

matsduf commented 1 year ago

An alternative approach I'd like to voice is to simply remove access control from batch_create. As an administrator you can already configure a public-facing zm-rpcapi with batch_create disabled and a different zm-rpcapi with batch_create enabled that is only accessible from a restricted network.

That could also work. From an implementation and code point of view it would probably be easier. From a user perspective it adds complexity to run several RPC API daemons.

I happy that I am not the only one disliking user_create.

matsduf commented 1 year ago

Today only add_batch_job requires authorization to run, which is probably the most important API method to control. I think we should put the suggested add_batch_list and add_batch_status under the same control. Both to prevent someone from the outside to see what batches are run and also prevent someone from the outside to create high load on the servers by requesting status for large batches.

One simple solution is to do as @mattias-p suggests above to control those by having those disabled by default and suggesting that a separate rpcapi is set up, which is only available locally. If we combine this by a configuration document for using the batch function I think this could work well. Then no token or user and password is not needed.

All external calls to the backend passes the Apache reverse proxy in a default setup. I would expect that it is possible to filter in that reverse proxy so that only API methods "approved" can pass through. If so, that could be the default, and no extra rpcapi daemon would have to be set up.

zonemaster / zonemaster-backend

Updated API methods for batches #1119

add_api_user (user_create)

add_batch_job (batch_create)

get_batch_job_result (batch_status)

get_batch_list

get_batch_status

Access control