hashicorp-forge / grove

A Software as a Service (SaaS) log collection framework.
https://hashicorp-forge.github.io/grove/
Mozilla Public License 2.0

No value found in cache #21

Closed bojiang1990 closed 1 year ago

bojiang1990 commented 1 year ago

Hi guys, I am trying to pull logs from Slack using the provided example. I have the bot token ready, but I have no idea what I should put in the "identifier" field. After trying many combinations, I always get the messages "No value found in cache" and "Connector was unable to collect logs". By the way, is a token the only accepted credential for pulling logs?

hcpadkins commented 1 year ago

Hey there,

Thanks for raising this issue, and for your interest in Grove!

The identity for a connector is always required, even if the API only supports a single "factor" of authentication, as this field is used internally by Grove. If the authentication method uses a bearer token and no additional "identity" needs to be provided alongside the token to authenticate (as with Slack), then we recommend setting the identity to something unique for the provider. We find the Slack enterprise identifier is a good choice here.

identity

  • The identity portion of the credential used to authenticate with the service or application. This may be a username, a realm, etc.
  • If no identity is required, such as in the case of an API that uses a Bearer token for authentication, this MUST still be set. In this case, this may be set to any unique value associated with the provider - such as enterprise identifier, account name, etc.

Via https://hashicorp-forge.github.io/grove/configuration.html#required-fields
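As a rough illustration of the guidance quoted above, a connector configuration for a bearer-token provider might be generated as follows. This is only a sketch: apart from identity, the field names ("name", "connector", "key"), the connector name "slack_audit_logs", and the placeholder values are assumptions and should be checked against the configuration documentation linked above.

```python
import json


def slack_connector_config(enterprise_id, token):
    """Build a hypothetical Grove connector configuration for Slack.

    Only the 'identity' field is described in this thread. The other field
    names and the connector name are assumptions made for illustration.
    """
    return {
        "name": f"slack-audit-{enterprise_id}",  # assumed field
        "connector": "slack_audit_logs",  # assumed connector name
        "identity": enterprise_id,  # unique per provider, even for bearer tokens
        "key": token,  # assumed field holding the token
    }


# Placeholder values, not real credentials.
print(json.dumps(slack_connector_config("E0EXAMPLE", "xoxp-REDACTED"), indent=2))
```

The point of the sketch is that identity is filled with a stable, provider-specific value (here the enterprise identifier) even though Slack itself only needs the token.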

--

For Slack, based on their documentation only Bearer tokens for authentication appear to be supported, and the Slack organisation must have an appropriate plan to enable the audit API:

Please note the Audit Logs API is only available to Slack workspaces on an Enterprise Grid plan. These API methods will not work for workspaces on a Free, Standard, or Business+ plan.

Via https://api.slack.com/admins/audit-logs

Endpoints that require authentication must include an OAuth token as an Authorization header with a type of Bearer. The token must be a Slack user token (beginning with xoxp) associated with an Enterprise Grid organization owner

Via https://api.slack.com/admins/audit-logs-call

However, for other connectors the authentication method can be a number of different options depending on what the vendor supports and expects.

--

The "No value found in cache" message is expected on the first run, as no pointers will have been stored in the cache yet, but "Connector was unable to collect logs" definitely indicates that an error is occurring.

Could you provide the logs you're receiving when attempting to perform this collection? There should hopefully be some additional information in an exception field on the log message that should help to pinpoint where the collection is failing :)

bojiang1990 commented 1 year ago

Thanks a lot for the detailed instructions. Here is the message I got: "exception": "401 Client Error: Unauthorized for url: https://api.slack.com/audit/v1/logs?oldest=1677652568&limit=1000". It looks like there is an authorization problem with my current credential. For Slack, I created a bot token by creating a new app. I found our company ID in Slack and used it as the "identity". I'm just trying to pull some real logs using Grove, and the source does not have to be Slack. Let me know if the credentials of other apps are easier to start with.

hcpadkins commented 1 year ago

Hey there,

It seems that the token may not have the required permissions?

For Slack, the auditlogs:read OAuth scope must be granted to the token, and the token must belong to a user "associated with an Enterprise Grid organization owner" account. Slack's own guidance on this process can be found here.

In terms of complexity, all "Bearer" token type connectors - such as Slack - typically require the least configuration. As long as the Slack organisation has an appropriate plan, and the token has been granted the auditlogs:read scope, the logs should flow without any additional configuration :)
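To check a token outside of Grove, the same request seen in the 401 error can be reproduced with the standard library. The sketch below is an assumption-laden illustration rather than Grove's own code: the helper name build_audit_request is hypothetical, and the prefix check simply encodes the Slack documentation quoted earlier, which says the token must be a user token beginning with xoxp (bot tokens begin with xoxb and would be rejected).

```python
import urllib.parse
import urllib.request

# Endpoint taken from the URL in the 401 error message above.
AUDIT_LOGS_URL = "https://api.slack.com/audit/v1/logs"


def build_audit_request(token, oldest, limit=1000):
    """Build (but do not send) a request to the Slack audit logs endpoint.

    Hypothetical helper for illustration only.
    """
    if not token.startswith("xoxp"):
        # Slack's documentation requires a user token (xoxp...), not a bot token.
        raise ValueError("expected a Slack user token beginning with 'xoxp'")

    query = urllib.parse.urlencode({"oldest": oldest, "limit": limit})
    request = urllib.request.Request(f"{AUDIT_LOGS_URL}?{query}")
    request.add_header("Authorization", f"Bearer {token}")
    return request
```

Passing the returned request to urllib.request.urlopen() would raise urllib.error.HTTPError with status 401 if Slack rejects the token, which should match what the connector reports.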

hcpadkins commented 1 year ago

Hey there,

Sorry for the delay in response here! The Cookie cutters for plugins and connectors can be found below:

To use these, you'll need to clone the Grove repository and perform the following steps. As always, we recommend working in a Python virtual environment (https://docs.python.org/3/library/venv.html).

  1. Ensure the Grove repository is cloned.
  2. Ensure cookiecutter is installed via pip (pip install cookiecutter).
  3. Change into the templates/code directory in the cloned Grove repository.
  4. Run cookie cutter (cookiecutter ./).
  5. Answer the questions prompted by Cookie Cutter.

This should generate a new directory containing your new connector. Inside you should find everything preconfigured and ready to develop.
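For anyone who prefers to script this, steps 1 to 4 above could be driven from Python roughly as follows. This is a sketch under a few assumptions: the function names are hypothetical, the repository URL is taken from the GitHub links elsewhere in this thread, and git is assumed to be on the PATH. Step 5 (answering the prompts) remains interactive.

```python
import subprocess
import sys
from pathlib import Path

# Repository URL as it appears in the GitHub links in this thread.
GROVE_REPO = "https://github.com/hashicorp-forge/grove"


def scaffold_commands(workdir):
    """Return (argv, cwd) pairs mirroring steps 1 to 4 (hypothetical helper)."""
    checkout = workdir / "grove"
    return [
        # 1. Clone the Grove repository.
        (["git", "clone", GROVE_REPO, str(checkout)], workdir),
        # 2. Install cookiecutter into the current (virtual) environment.
        ([sys.executable, "-m", "pip", "install", "cookiecutter"], workdir),
        # 3 and 4. Run cookiecutter from the templates/code directory.
        ([sys.executable, "-m", "cookiecutter", "./"], checkout / "templates" / "code"),
    ]


def run_scaffold(workdir):
    """Run each command in order, stopping at the first failure."""
    for argv, cwd in scaffold_commands(workdir):
        subprocess.run(argv, cwd=cwd, check=True)
```

Calling run_scaffold(Path.cwd()) from inside an activated virtual environment should leave the generated connector under templates/code in the fresh checkout.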

You can pip install this newly created connector into your Python environment, which will automatically ensure that Grove is installed. I would personally recommend using an editable install (pip install -e ., see https://setuptools.pypa.io/en/latest/userguide/development_mode.html) during development to simplify your testing cycle slightly.

Once installed, you should be able to call your new connector from Grove by adding a connector configuration (https://hashicorp-forge.github.io/grove/configuration.html#connectors) which refers to this newly created connector by its name. All of the Setup Tools entrypoints will be configured automatically in the created Python project by Cookie Cutter.

We'll be adding some additional documentation on this in the near future, but in the meantime please see the Internals (https://hashicorp-forge.github.io/grove/internals.html) and API (https://hashicorp-forge.github.io/grove/api.html) documentation, which include some useful information that may be of assistance during development.

I hope this helps, and good luck with your first connector!

bojiang1990 commented 1 year ago

Hey there,

Thanks again for your detailed instructions. Our company is very interested in Grove and has decided to fork it and build our own project on top of it. However, when I followed your steps to create a new connector, I hit the following error while installing it:

ERROR: Could not find a version that satisfies the requirement grove<2.0,>=1.0.0 (from grove-connectors-my-connector) (from versions: 0.0.0, 1.0.0rc1)
ERROR: No matching distribution found for grove<2.0,>=1.0.0

I wonder if that is an issue related to the current grove version. Any hint on fixing this?

Looking forward to hearing from you.

Best Regards,

Bo.

hcpadkins commented 1 year ago

Hey there,

That's great to hear!

You're absolutely correct about the version. As only a pre-release has been cut so far, the version constraint needs to explicitly allow release candidates so that a matching version can be found. You can do this by updating the version constraint in your newly created connector's setup.py to the following:

install_requires=[
    "grove>=1.0.0rc1,<2.0",  # Note the addition of 'rc1'.
]

This should not be required for too much longer, so we will leave the template in the repository as is for the moment :)
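The reason the original constraint fails is that a release candidate sorts before its final release, so 1.0.0rc1 does not satisfy >=1.0.0. A small sketch of this, assuming the third-party packaging library is installed (it is not part of the standard library), with a hypothetical helper named matching:

```python
from packaging.specifiers import SpecifierSet

# Versions listed as available in the pip error earlier in this thread.
PUBLISHED = ["0.0.0", "1.0.0rc1"]


def matching(constraint):
    """Return the published versions that satisfy a version constraint."""
    spec = SpecifierSet(constraint)
    # prereleases=True makes the comparison explicit: even when pre-releases
    # are allowed, 1.0.0rc1 is still ordered before 1.0.0.
    return [v for v in PUBLISHED if spec.contains(v, prereleases=True)]


print(matching(">=1.0.0,<2.0"))     # template constraint
print(matching(">=1.0.0rc1,<2.0"))  # constraint suggested above
```

With the two published versions above, the template constraint matches nothing, while the amended constraint matches 1.0.0rc1.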

Thank you again for raising this issue, and good luck with your connectors!

bojiang1990 commented 1 year ago

Hi there, that works, and I can now install the connectors. However, when I run grove, the following error pops up: "exception": "Requested handler could not be found with name 'jumpcloud_audit_trails' (group 'grove.connectors')".

Here is what I did to create the JumpCloud connector.

First way:

  1. I created a folder called jumpcloud under grove/grove/connectors. Inside this folder there are three Python files, "__init__.py", "api.py" and "audit_trails.py". Each has similar content to the other connectors, modified for the JumpCloud API.
  2. I registered the connector in setup.py in the main folder by adding a line to entry_points: entry_points = {"grove.connectors":["jumpcloud_audit_trails = grove.connectors.jumpcloud.audit_trails:Connector"]}.
  3. I installed Grove with pip install -e . so that the new handler is registered.

Second way:

  1. Generate the connector by running Cookiecutter.
  2. Modify the API parts so that it pulls logs from JumpCloud.
  3. cd to the created connector's folder and run pip install -e .

I stored my configs temporarily in the config folder and set the exported config path to it.

If my understanding is correct, I should be able to collect logs from JumpCloud just as from other sources. But the error popped up during the Grove run.

Please let me know if I missed something.

Thanks!

Best,

Bo

hcpadkins commented 1 year ago

Hey there,

Sorry for the delay in reply! This error indicates that the plugin is not discoverable.

Can you please open a Python REPL in the same virtual environment that you're working in and verify that you're able to import the new connector directly? Additionally, dumping the list of known entry_points should help track down what's going on here.

For completeness, the following should work to get the required information to investigate further :)

# REPL loaded via 'ipython' or 'python' from the shell, depending on your preference.

# Load the connector manually for validation.
from grove.connectors.jumpcloud import audit_trails

# Output the NAME of the connector, just to be sure.
print(audit_trails.Connector.NAME)

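# Output all 'grove.connectors' entry_points.
import sys
from importlib.metadata import entry_points

# Python 3.10+ selects by group; the dict-style .get() was removed in 3.12.
if sys.version_info >= (3, 10):
    candidates = entry_points(group="grove.connectors")
else:
    candidates = entry_points().get("grove.connectors", [])

for candidate in candidates:
    print(candidate.name)

bojiang1990 commented 1 year ago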

Hi guys, we have successfully debugged it, and we are moving on to deploying Grove! One question I have is how the cache handler works. To be more specific, we want to store pointers in AWS DynamoDB. How should I create the table? For now, I created a table with the PK set to the connector's name, then set the table name in one of the environment variables. But the following error showed up: "Unable to get value from cache. An error occurred (ResourceNotFoundException) when calling the GetItem operation: Requested resource not found". On the first collection, will it automatically set up the PK and SK names? Thanks!

hcpadkins commented 1 year ago

Hey there,

That's great news, we're glad to hear it! :)

We use Terraform to create our DynamoDB tables along with the rest of the Grove infrastructure at deployment time. If you're not using Terraform, you can just create your DynamoDB table manually via the AWS API or console.

Here's a sample of our configuration:

https://github.com/hashicorp-forge/grove/blob/main/templates/deployment/terraform-aws-ecs/modules/grove/dynamodb.tf#L5
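If Terraform is not in use, a table could be created through the AWS API with something like the sketch below. The key attribute names ("pk" and "sk", both strings) and the on-demand billing mode are assumptions made for illustration, so the linked dynamodb.tf should be treated as the authoritative definition. The function names are hypothetical, and boto3 is a third-party dependency. Note also that DynamoDB returns ResourceNotFoundException from GetItem when the named table cannot be found in the account and region being queried, so the configured table name and region are worth double-checking.

```python
def table_definition(table_name):
    """Build create_table arguments for a cache table (hypothetical helper).

    The 'pk'/'sk' attribute names are assumptions; compare them against the
    Terraform configuration linked above before use.
    """
    return {
        "TableName": table_name,
        "KeySchema": [
            {"AttributeName": "pk", "KeyType": "HASH"},   # partition key
            {"AttributeName": "sk", "KeyType": "RANGE"},  # sort key
        ],
        "AttributeDefinitions": [
            {"AttributeName": "pk", "AttributeType": "S"},
            {"AttributeName": "sk", "AttributeType": "S"},
        ],
        "BillingMode": "PAY_PER_REQUEST",  # assumed; avoids capacity planning
    }


def create_cache_table(table_name, region):
    """Create the table and wait until it is ready to serve requests."""
    import boto3  # imported here so the definition above has no dependencies

    client = boto3.client("dynamodb", region_name=region)
    client.create_table(**table_definition(table_name))
    client.get_waiter("table_exists").wait(TableName=table_name)
```

Only the key attributes need to be declared up front. Any other attributes are schemaless in DynamoDB and are written with the items themselves.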

hcpadkins commented 1 year ago

Closing this one out for now, but please feel free to open a new issue if you encounter any further problems :)