elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana
Other
19.74k stars 8.14k forks source link

Saved Objects API - Same interface should be available for internal user and regular user #91890

Open mattkime opened 3 years ago

mattkime commented 3 years ago

The data plugin's index pattern api requires access to saved objects. Solutions need to be able to use this interface both as a normal user and as an internal user. Currently, getScopedClient returns a SavedObjectsClientContract while createInternalRepository returns a ISavedObjectsRepository. As best I can tell we can get the same type by new SavedObjectsClient(createInternalRepository()) - it seems there should be a getInteralClient function that does this. While the code is simple the type differences seem to create a lot of confusion.

Additionally, this new method will use the default space unless otherwise specified. This has caught a number of developers by surprise. We can use this as an opportunity to default to namespaces: ['*'].

elasticmachine commented 3 years ago

Pinging @elastic/kibana-core (Team:Core)

jportner commented 3 years ago

Additionally, this new method will use the default space unless otherwise specified. This has caught a number of developers by surprise. We can use this as an opportunity to default to namespaces: ['*'].

We spoke offline, but I'll elaborate here in GH. You're referring to SavedObjectsClient#find, searching across all spaces --

I don't agree that we should default to searching across all spaces, even when instantiating the SOC with the internal repository. This sounds very risky, and is directly opposed to our implementation of Spaces, which is that consumers should transparently be scoped to a specific space. There is no analog for #get, #bulkGet, #create, etc. to operate across "all spaces". I think that consumers need to be very intentional when they want to search across all spaces.

It's unfortunate that there has been confusion, but I think we need to clear it up with better docs and guidance. CC @legrego

rudolf commented 3 years ago

I agree with it being confusing and this is not a great developer experience. However, using the internal user for saved objects is discouraged and very rarely the correct approach. So when we initially exposed these API's we only exposed the internalRepository so that the spaces and security plugin can use it to wrap the SavedObjectsClient. At the time only usage collection and reporting actually had a valid use case for using the internal user.

So I guess the internal user saved objects client is awkward enough to make you think "I must be doing something wrong" and then ask the security team to review your use case. I'm open to adding a getInteralClient API with better documentation and a big warning though.

Having said that, why would solutions need to use the data plugin with the internal user?

mattkime commented 3 years ago

There is no analog for #get, #bulkGet, #create, etc. to operate across "all spaces".

I can understand 'create' not working in a namespace agnostic manner, but its not clear to me why get and bulkGet couldn't.

So I guess the internal user saved objects client is awkward enough to make you think "I must be doing something wrong" and then ask the security team to review your use case.

Its definitely awkward but I'm not sure it rises above ordinary awkwardness of writing code. I'm not sure people are thinking "something must be wrong".

At the time only usage collection and reporting actually had a valid use case for using the internal user.

Yes, and those are the use cases I'm aware of as well.

why would solutions need to use the data plugin with the internal user?

Telemetry on index patterns.


I think there are a couple of things to keep in mind:

a) We're moving away from direct access of saved objects. b) Telemetry for saved objects is a common use case. c) Most APIs don't have a need to specify the space aside from using '*' for the telemetry case.

I'd like to avoid every API thats also a saved object wrapper from having the same namespace aware code if the problem could be solved at a lower level.

afharo commented 3 years ago

Telemetry on index patterns.

related issue #92001.

Maybe we shouldn't use Saved Objects APIs to collect telemetry, and we should use ES raw requests to the index .kibana to overcome this limitation?

jportner commented 3 years ago

I can understand 'create' not working in a namespace agnostic manner, but its not clear to me why get and bulkGet couldn't.

You can have one dashboard in the Default space with ID '123', and a different separate dashboard in the Other space with ID '123'. They are totally distinct objects but consumers don't understand the difference. Consumers see them each as an object with type: 'dashboard', id: '123', but when they are serialized to ES documents their raw IDs are dashboard:123 and other:dashboard:123, respectively.

Spaces was originally designed to keep all saved objects completely separated between spaces, so we did not consider keeping IDs unique across them. Now that we are attempting to implement Sharing Saved Objects, we're reckoning with that decision.

Maybe we shouldn't use Saved Objects APIs to collect telemetry, and we should use ES raw requests to the index .kibana to overcome this limitation?

You could, since this is a very limited, read-only use case. But keep in mind that if the Spaces plugin is disabled, there may be "invisible" saved objects that exist in non-Default spaces, that are in the index but really are not accessible by users. It's not clear to me how many users actually disable the Spaces plugin, but I would hazard a guess that number is relatively small.

pgayvallet commented 3 years ago

I'm open to adding a getInteralClient API with better documentation and a big warning though.

I'm too. I'm unsure to understand to what extent we should wire it to the factory provider and factory wrappers? Should this new getInternalClient just return new SavedObjectsClient(createInternalRepository()), or should any of our scoped client's enhancements also be applied to it?

We can use this as an opportunity to default to namespaces: ['']. Most APIs don't have a need to specify the space aside from using '' for the telemetry case.

Adding the param manually when using find seems fine to me. I'm overall +1 on https://github.com/elastic/kibana/issues/91890#issuecomment-781859729

Maybe we shouldn't use Saved Objects APIs to collect telemetry, and we should use ES raw requests to the index .kibana to overcome this limitation?

I would have said 'this is probably fine' a few weeks ago, but now that using a distinct index per SO type is on the table, I think using the SO client/API should be the way to go (of course, we would REALLY need to have the aggs PR merged at some point for that, as I don't really see how to collect telemetry data without aggregations tbh)

afharo commented 3 years ago

@jportner I'm wondering createScopedRepository is also affected by the namespace limits? It returns a repository vs. a client.

According to the docs, the extension only applies to getScopedClient, but I'd love to double-check.

jportner commented 3 years ago

The repository CRUD actions all operate in a single space context, no matter how the repository is created. The only exception is the find action, which can work across all spaces if the namespaces: ['*'] argument is used.

The only difference with the internal repository vs scoped repository, is that the scoped repository determines the space context based on the user's HTTP request (e.g., the URL that was used for the HTTP API call...). The internal repository, on the other hand, has no HTTP request associated with it, so it always operates in the Default space context.