elastic / connectors

Source code for all Elastic connectors, developed by the Search team at Elastic, and home of our Python connector development framework
https://www.elastic.co/guide/en/enterprise-search/master/index.html
Other
74 stars 128 forks source link

Doc improvements for Python connectors #360

Closed ppf2 closed 1 year ago

ppf2 commented 1 year ago

General doc feedback on https://github.com/elastic/connectors-python/blob/main/docs/DEVELOPING.md guide:

  1. The pre-requisite and building information are scattered around the page (in the middle and at the end). I think it’s better to be consistent with the Ruby guide https://github.com/elastic/connectors-ruby/blob/main/docs/DEVELOPING.md where we put such information right at the top of the page.
  2. Right now we are embedding network drives and S3 connector docs within this developing page. It will be cleaner if we separate them out into their own configuration pages. We can always cross reference them from the DEVELOPING page.
  3. The page doesn’t mention the make run command that is mentioned in a different location https://github.com/elastic/connectors-python/blob/main/docs/CONFIG.md#run-the-connector-service-for-a-custom-connector . Instead, it recommends using bin/elastic-ingest to run the service. On the Ruby side, we use make run. It will be helpful to be consistent.
  4. The page doesn’t provide what the construct looks like when defining the configuration for say Amazon S3 connector. For example, it mentions that buckets is a configuration for S3 connector but doesn’t provide an example of whether this is a setting for the root level of the configuration or if it needs to be specified under some S3 clause, etc.. It appears that a config screen will appear on the UI once the connector service is successfully connected to Enterprise Search so it does not appear to be a configuration done in the yml. It will be helpful to clarify this for users.
  5. The developers guide does not provide any information on general configuration settings in config.yml. For example, it does not mention that elasticsearch.host requires the full scheme, host and port in its value. Likewise, it doesn’t specify where to put the Elasticsearch API key in the config.yml as part of “build a connector” from the UI. It will be helpful to cross link https://github.com/elastic/connectors-ruby/blob/main/docs/CONFIG.md from the developers guide.
ppf2 commented 1 year ago
  1. When we mention Python 3.10+ as pre-requisite, it will be helpful to also add a note about the System/Volumes/Data/Applications/Python\ 3.10/Install\ Certificates.command command users may have to run in order to connect with S3. Ref: https://github.com/elastic/connectors-python/issues/359#issuecomment-1397753688
parth-crest commented 1 year ago
  1. The pre-requisite and building information are scattered around the page (in the middle and at the end). I think it’s better to be consistent with the Ruby guide https://github.com/elastic/connectors-ruby/blob/main/docs/DEVELOPING.md where we put such information right at the top of the page.

For the prerequisites, the only common prerequisite is installing python3.10. All the other prerequisites are connector specific, so we can not move it at the top and make it generic. Also as @tarekziade mentioned in the PR he will be adding the Installation section for installing python3.10 for Linux and Mac that would solve the #366 issue as well.

  1. Right now we are embedding network drives and S3 connector docs within this developing page. It will be cleaner if we separate them out into their own configuration pages. We can always cross reference them from the DEVELOPING page.

As Dana mentioned in the issue here https://github.com/elastic/connectors-python/issues/260#issuecomment-1375890421 we have kept all the connectors details in the same file. We can update anytime in the future when we want to split them in different connector.md file.

  1. The page doesn’t provide what the construct looks like when defining the configuration for say Amazon S3 connector. For example, it mentions that buckets is a configuration for S3 connector but doesn’t provide an example of whether this is a setting for the root level of the configuration or if it needs to be specified under some S3 clause, etc.. It appears that a config screen will appear on the UI once the connector service is successfully connected to Enterprise Search so it does not appear to be a configuration done in the yml. It will be helpful to clarify this for users.

Since Network drive and S3 are not the native connector from the UI, For the first sync it would take the configuration from get_default_configuration() method in the s3.py and network_drive.py file for the respective connector. We can update configuration after the 1st sync(in case of recurrening sync) from the UI. Please elaborate more for this point if I am not able to answer your question.

  1. When we mention Python 3.10+ as pre-requisite, it will be helpful to also add a note about the System/Volumes/Data/Applications/Python\ 3.10/Install\ Certificates.command command users may have to run in order to connect with S3.

As mentioned by Tarek here, He will take care of this issue.

For the point 3 and 5 we will raise PR soon

ppf2 commented 1 year ago

Since Network drive and S3 are not the native connector from the UI, For the first sync it would take the configuration from get_default_configuration() method in the s3.py and network_drive.py file for the respective connector. We can update configuration after the 1st sync(in case of recurrening sync) from the UI. Please elaborate more for this point if I am not able to answer your question.

Yeah, that's the part that is ambiguous. On first sync, where do users specify the configuration and in what format? if it's the config.yml, it will be helpful to provide an example so users will know if the configuration needs to go into the root level of the yml, nested under a s3: element or something else, etc.. It will be helpful to mention the UI editor appearing at some point, etc..

akanshi-crest commented 1 year ago

On first sync, user will specify the configuration in get_default_configuration method under s3.py file. And for this we already added a section with examples in developing.md file(https://github.com/elastic/connectors-python/blob/main/docs/DEVELOPING.md#configure-amazon-s3-connector)

ppf2 commented 1 year ago

Thanks! Do we have anything on the roadmap @danajuratoni to improve the first sync experience before the promotion to a native connector? It's uncommon for users to have to update code file to set up the initial configuration.

tarekziade commented 1 year ago

I think there’s a misunderstanding. If you use those connectors you don’t have to change the yaml file or the code. Once it’s started, the connector will send the default config to kibana and you will get a form in kibana with pre filled default values you can change.

@ppf2 is there anything else you are thinking about ?

tarekziade commented 1 year ago

IOW the only thing you do in the config file is set the api key and the connector id, everything else is automated

ppf2 commented 1 year ago

Yeah, what I mean is that the doc can be more clear because it simply lists the available configuration settings for the S3 connector, which makes the user wonder where to put them, in the config.yml? If so, in what format? Certainly, i found out from testing that once the connection can be made between the connector service and Enterprise Search, the UI will display the editor for configuration. So the only thing we really need to do here is to document that those S3 configurations we mention in the doc are available on the UI once a successful connection can be made (i.e. it's not something they need to pre-configure in the yml, etc..). Hope this makes sense 😄

danajuratoni commented 1 year ago

i found out from testing that once the connection can be made between the connector service and Enterprise Search, the UI will display the editor for configuration

We have it on our radar to write a developer documentation for detailing the flow and addressing points you mentioned.

For the point 3 and 5 we will raise PR soon @parth-crest Is there any pending work on these points? I remember seeing these fixes in a PR some time ago.

If these fixes are merged, I think we can close this ticket.

praveen-elastic commented 1 year ago

Hey @ppf2, since the PRs got merged, I guess it's safe to close this issue now :)

ppf2 commented 1 year ago

Thanks! I will watch for the developer documentation when it is available in the future :)