kafka-ops / julie

A solution to help you build automation and gitops in your Apache Kafka deployments. The Kafka gitops!
MIT License

Error on first use of S3 Backend #454

Closed samudurand closed 2 years ago

samudurand commented 2 years ago

Describe the bug: When running an existing Julie setup for the first time after adding an S3Backend, I get an error that clearly says the state file doesn't exist yet in S3.

Error:  2022-03-04 08:43:22.094 [main] S3Backend - software.amazon.awssdk.services.s3.model.NoSuchKeyException: The specified key does not exist. (Service: S3, Status Code: 404, Request ID: XXXXXXX, .....)

To Reproduce: Add an S3Backend to a Julie setup that didn't have one before.

topology.builder.state.processor.class=com.purbon.kafka.topology.backend.S3Backend
julie.s3.region=eu-central-1
julie.s3.bucket=mybucket

Expected behavior: The backend should create an initial state based on the Julie config instead of failing to initialize.

Additional context: This looks similar to the other issue I created for the Redis backend. It doesn't seem able to handle an initial run and needs an existing state before it can work: https://github.com/kafka-ops/julie/issues/453

purbon commented 2 years ago

Moin, I have been doing some research on this issue. I can confirm this happens; however, I think it is just the message that is confusing.

From:

  public BackendState load() {
    try {
      String content = getRemoteStateContent(STATE_FILE_NAME);
      return (BackendState) JSON.toObject(content, BackendState.class);
    } catch (IOException ex) {
      // No readable remote state: log at debug level and start from an empty state.
      LOGGER.debug(ex);
      return new BackendState();
    }
  }

This should not fail during the first load. What do you think?

samudurand commented 2 years ago

To make that code work, I think you need to catch more than just IOException. According to the javadocs, NoSuchKeyException does not extend IOException: https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/services/s3/model/NoSuchKeyException.html

Then it should work
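The point about the exception hierarchy can be illustrated with a minimal, self-contained sketch. It does not use the AWS SDK or Julie's real classes: `MissingKeyException`, `SketchState`, `RemoteReader` and `FirstRunLoadSketch` are hypothetical stand-ins, and the `.cluster-state` key name is an assumption. The only fact carried over is that the SDK's NoSuchKeyException is an unchecked exception, so a `catch (IOException ...)` clause alone would not intercept it.

```java
import java.io.IOException;

// Hypothetical stand-in for the SDK's NoSuchKeyException: unchecked,
// therefore not a subtype of IOException.
class MissingKeyException extends RuntimeException {
  MissingKeyException(String message) { super(message); }
}

// Hypothetical minimal state holder; not Julie's real BackendState.
class SketchState {
  final String content;
  SketchState(String content) { this.content = content; }
  boolean isEmpty() { return content.isEmpty(); }
}

// Hypothetical reader abstraction over the remote store.
interface RemoteReader {
  String read(String key) throws IOException;
}

class FirstRunLoadSketch {
  // Assumed key name, for illustration only.
  static final String STATE_FILE_NAME = ".cluster-state";

  // Handles only IOException: an unchecked "missing key" error propagates.
  static SketchState loadCatchingIoOnly(RemoteReader reader) {
    try {
      return new SketchState(reader.read(STATE_FILE_NAME));
    } catch (IOException ex) {
      return new SketchState("");
    }
  }

  // Also treats a missing key as "no state yet" and returns an empty state.
  static SketchState loadCatchingMissingKey(RemoteReader reader) {
    try {
      return new SketchState(reader.read(STATE_FILE_NAME));
    } catch (IOException | MissingKeyException ex) {
      return new SketchState("");
    }
  }
}
```

With a reader that always throws `MissingKeyException`, the first method lets the exception escape to the caller, while the second returns an empty state.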

purbon commented 2 years ago

Interesting. I think this might be because I test with a dummy S3 system, which must have been raising different exceptions ... bloody dummies ;-) hehehe

samudurand commented 2 years ago

I don't know what you use for your testing, but I found that LocalStack is a very good fake AWS test base. For development as well. It reproduces 90% of the main AWS behaviours.

purbon commented 2 years ago

The S3 dummy I use for testing is:

    <dependency>
      <groupId>io.findify</groupId>
      <artifactId>s3mock_2.13</artifactId>
      <version>0.2.6</version>
      <scope>test</scope>
    </dependency>

samudurand commented 2 years ago

Interesting tool, I might use that as well to test my own stuff, for simplicity's sake. But if you are ready to deal with containers in Java, I recommend the combo TestContainers + LocalStack (they have a premade module for LocalStack). It requires no effort to set up and it orchestrates the whole container management side; it's basically like having a local AWS spun up and torn down for each test.

purbon commented 2 years ago

I will certainly take a look, thanks @samudurand

purbon commented 2 years ago

Please help me understand one thing, and sorry if I am being stupid ... I would never rule that out.

https://github.com/kafka-ops/julie/blob/master/src/main/java/com/purbon/kafka/topology/backend/S3Backend.java#L52-L79

is the code that loads the remote state. There, if an S3Exception (the parent class of NoSuchKeyException) is thrown, it gets caught and logged. Then an IOException is raised, which is caught as well, and an empty state is returned.

Does that make sense to you? Unless I am missing something, this should not cause any problem. However, the logging was causing confusion.

PS: I am now certainly going to test with a real S3 bucket ;-) and see how this behaves.
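The flow described in this comment can be sketched as follows. This is not the code from S3Backend.java: `StorageServiceException`, `Fetcher` and `TwoLayerLoadSketch` are hypothetical names, the state is reduced to a plain string, the `.cluster-state` key is an assumption, and log lines are collected in a list instead of going through a real logger. It only shows why an error message can appear in the output even though the load still ends with an empty state.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an unchecked storage-service exception.
class StorageServiceException extends RuntimeException {
  StorageServiceException(String message) { super(message); }
}

class TwoLayerLoadSketch {
  // Log lines are collected here so the effect is visible without a logger.
  final List<String> logLines = new ArrayList<>();

  // Hypothetical abstraction over the call that reads the remote object.
  interface Fetcher {
    String fetch(String key);
  }

  // Inner layer: log the service exception, then rethrow it as an IOException.
  String getRemoteStateContent(Fetcher fetcher, String key) throws IOException {
    try {
      return fetcher.fetch(key);
    } catch (StorageServiceException ex) {
      logLines.add("ERROR " + ex.getMessage());
      throw new IOException(ex);
    }
  }

  // Outer layer: an IOException is treated as "no state yet",
  // so an empty state (here an empty string) is returned.
  String load(Fetcher fetcher) {
    try {
      return getRemoteStateContent(fetcher, ".cluster-state");
    } catch (IOException ex) {
      return "";
    }
  }
}
```

In this sketch the caller gets an empty state and no exception, yet one error line has been recorded, which matches the observation that the logging, rather than the behaviour, was the confusing part.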

samudurand commented 2 years ago

Ah, you are right, I hadn't seen the whole code. But then why does it not just accept the fact that it doesn't find a file and use the returned BackendState? Unfortunately I cannot reproduce my case at the moment, since I found an alternative option. I would suggest closing this issue for now, since maybe it's just my own setup that's not quite right. If someone else hits that problem again, they can always reopen :)

purbon commented 2 years ago

Doing a final test with a real S3, then following your suggestion, as it makes total sense! Thanks a lot again for your help here @samudurand, it is much appreciated!

samudurand commented 2 years ago

You are very welcome, Julie is a great tool!

purbon commented 2 years ago

I tried with a real S3, with the bucket created but empty. After the test, no errors were spotted, and the bucket contained the state file:

  aws s3 ls s3://pere.julie.ops
  2022-03-25 10:31:18       1199 .cluster-state

Closing this issue for now; feel free to reopen if necessary.