eclipse-apoapsis / ort-server

A scalable server implementation of the OSS Review Toolkit.
https://eclipse-apoapsis.github.io/ort-server/
Apache License 2.0

Add an API endpoint to upload pre-created analyzer results #193

Open sschuberth opened 6 months ago

sschuberth commented 6 months ago

Instead of running the analyzer on the ORT server, it is sometimes necessary to run the analyzer on-premises / locally (to avoid cloning the project in the cloud). For such a use-case the ORT server should have an endpoint to upload an existing analyzer result.
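
For illustration, a minimal sketch of what such an upload could look like from the on-premises side, assuming a hypothetical endpoint that accepts the serialized analyzer result; the URL path, content type, and authentication header below are placeholders, not an existing ort-server API:

```kotlin
// Sketch only: upload a locally created analyzer result to a (hypothetical) ort-server endpoint.
// The endpoint path and token handling are placeholders for discussion purposes.
import io.ktor.client.HttpClient
import io.ktor.client.engine.cio.CIO
import io.ktor.client.request.header
import io.ktor.client.request.post
import io.ktor.client.request.setBody
import io.ktor.client.statement.HttpResponse
import io.ktor.http.ContentType
import io.ktor.http.HttpHeaders
import io.ktor.http.contentType
import java.io.File
import kotlinx.coroutines.runBlocking

fun main() = runBlocking {
    val client = HttpClient(CIO)

    // Result of a local "ort analyze" run, created on-premises to avoid cloning in the cloud.
    val analyzerResult = File("analyzer-result.yml").readBytes()

    val response: HttpResponse = client.post(
        "https://ort-server.example.org/api/v1/analyzer-results" // hypothetical path
    ) {
        header(HttpHeaders.Authorization, "Bearer <token>") // placeholder authentication
        contentType(ContentType("application", "x-yaml"))
        setBody(analyzerResult)
    }

    println("Upload returned ${response.status}")
    client.close()
}
```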

sschuberth commented 5 months ago

@heliocastro, can you maybe share the API specs of your Optima service here for reference?

heliocastro commented 5 months ago

I will be doing this next Monday; I still need to clear the internal naming of the endpoints.

mmurto commented 2 months ago

As discussed in #907, it could make sense to have the API work in a way that an existing analyzer result is uploaded to a specific product, and uploading it creates a new repository in that product. Users could then specify whatever run arguments they want for runs against that repository, and those runs would be executed with the same analyzer result as the base, without the need to upload the result file for each individual run. There should also be an update endpoint to upload a new analyzer result that overwrites the previously uploaded one, so that subsequent runs use the new analyzer result as the base.
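
As a discussion aid, a rough sketch of how such endpoints could be laid out (Ktor-style routing); the paths, parameter names, and handler bodies below are assumptions derived from this comment, not actual ort-server code:

```kotlin
// Sketch only: possible route layout for the proposal above.
// Paths, parameters, and persistence are placeholders, not existing ort-server routes.
import io.ktor.http.HttpStatusCode
import io.ktor.server.application.call
import io.ktor.server.request.receive
import io.ktor.server.response.respond
import io.ktor.server.routing.Route
import io.ktor.server.routing.post
import io.ktor.server.routing.put
import io.ktor.server.routing.route

fun Route.analyzerResultUploadRoutes() {
    route("/api/v1/products/{productId}") {
        // Upload a pre-created analyzer result: creates a new repository in the product
        // and stores the result as the base for all future runs against that repository.
        post("/analyzer-results") {
            val productId = call.parameters["productId"]!!.toLong()
            val resultBytes = call.receive<ByteArray>()
            // Deserialize the result, derive repository information, persist (omitted).
            call.respond(
                HttpStatusCode.Created,
                "Created repository in product $productId from ${resultBytes.size} bytes."
            )
        }

        // Replace the previously uploaded analyzer result; subsequent runs use the new one.
        put("/repositories/{repositoryId}/analyzer-result") {
            val repositoryId = call.parameters["repositoryId"]!!.toLong()
            val newResult = call.receive<ByteArray>()
            // Overwrite the stored analyzer result for this repository (persistence omitted).
            call.respond(
                HttpStatusCode.OK,
                "Replaced analyzer result for repository $repositoryId (${newResult.size} bytes)."
            )
        }
    }
}
```

The PUT route models the update endpoint mentioned above: it replaces the stored result so that all runs after the upload use the new analyzer result as the base.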

sschuberth commented 2 months ago

> I will be doing this next Monday; I still need to clear the internal naming of the endpoints.

Any progress here BTW, @heliocastro?

mnonnenmacher commented 2 months ago

> As discussed in #907, it could make sense to have the API work in a way that an existing analyzer result is uploaded to a specific product, and uploading it creates a new repository in that product. Users could then specify whatever run arguments they want for runs against that repository, and those runs would be executed with the same analyzer result as the base, without the need to upload the result file for each individual run. There should also be an update endpoint to upload a new analyzer result that overwrites the previously uploaded one, so that subsequent runs use the new analyzer result as the base.

This brings up some questions about how to handle the repository information in the uploaded ORT result:

mmurto commented 2 months ago
> • What if the ORT result has no provenance information? Currently the repository URL is a mandatory field; if there is no repository URL, we would need another column to identify the repository and to show something other than the ID in the UI.

If using provided analyzer results requires a lot of workarounds, we could consider creating separate database entries for them, similar to repositories (see the sketch after this list).

> • If there is no provenance information, do we have to ensure somehow that uploaded analyzer results for the same repository match?

Maybe show a warning if the user is uploading a result with different information, which the user can then either discard or upload anyway if they so wish.

> • Should we even allow ORT results without provenance information? If we do, automatic repository creation would be problematic.

I feel not allowing ORT results without provenance information would be quite an arbitrary limitation if ORT itself can work on them. I think there is nothing in the analysis that really needs provenance information for the source, except for scanning the project, for which an ORT issue should suffice.
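
To make the "another column to identify the repository" idea above concrete, here is a small sketch of deriving a repository identifier from an uploaded result. It assumes ORT's `OrtResult` / `VcsInfo` model; the caller-provided display name and the `UploadedRepositoryId` type are hypothetical and only illustrate the fallback:

```kotlin
// Sketch only: derive something to identify the repository by from an uploaded result.
// Assumes ORT's OrtResult/Repository/VcsInfo model; the fallback display name is hypothetical.
import org.ossreviewtoolkit.model.OrtResult

/** Identification to use when creating a repository for an uploaded analyzer result. */
data class UploadedRepositoryId(
    val vcsUrl: String?,    // null if the result carries no provenance information
    val displayName: String // shown in the UI instead of the database ID
)

fun identifyRepository(ortResult: OrtResult, providedName: String?): UploadedRepositoryId {
    // An empty URL in vcsProcessed means the result has no usable provenance information.
    val url = ortResult.repository.vcsProcessed.url.takeIf { it.isNotBlank() }

    return when {
        url != null -> UploadedRepositoryId(vcsUrl = url, displayName = url)
        providedName != null -> UploadedRepositoryId(vcsUrl = null, displayName = providedName)
        // No provenance information and no name provided by the caller: reject the upload
        // (or, alternatively, create the repository with a generated placeholder name).
        else -> throw IllegalArgumentException(
            "The analyzer result contains no VCS information; a repository name must be provided."
        )
    }
}
```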

sschuberth commented 2 months ago

> I feel not allowing ORT results without provenance information would be quite an arbitrary limitation if ORT itself can work on them.

It can't, see the related https://github.com/oss-review-toolkit/ort/issues/8803.

mnonnenmacher commented 2 months ago

> I feel not allowing ORT results without provenance information would be quite an arbitrary limitation if ORT itself can work on them.

> It can't, see the related oss-review-toolkit/ort#8803.

But that might change, so we should still consider it when designing the upload functionality.

sschuberth commented 1 month ago

Ideas from today's meeting with @mnonnenmacher: