infinyon / fluvio

Lean and mean distributed stream processing system written in rust and web assembly.
https://www.fluvio.io/
Apache License 2.0
2.77k stars 199 forks source link

feat(fluvio-cli): add base offset to partition list command #4059

Open PanGan21 opened 3 weeks ago

PanGan21 commented 3 weeks ago

Adds base offset to fluvio partition list command. Related: #3975

digikata commented 3 weeks ago

@EstebanBorai could you take a look at this https://github.com/infinyon/fluvio/actions/runs/9438543985/job/25995712397?pr=4059. Offhand this may be from the test asserts needing to catch up to the optimized fvm update change

EstebanBorai commented 3 weeks ago

@EstebanBorai could you take a look at this https://github.com/infinyon/fluvio/actions/runs/9438543985/job/25995712397?pr=4059. Offhand this may be from the test asserts needing to catch up to the optimized fvm update change

Giving it a check!

EstebanBorai commented 3 weeks ago

@EstebanBorai could you take a look at this https://github.com/infinyon/fluvio/actions/runs/9438543985/job/25995712397?pr=4059. Offhand this may be from the test asserts needing to catch up to the optimized fvm update change

Giving it a check!

Update! Here Im working out a fix to take in consideration the same version tests: https://github.com/infinyon/fluvio/pull/4062

PanGan21 commented 2 weeks ago

Hey @digikata thanks for the assistance already. Do you maybe know why upgrading doesn't work according to this pipeline? https://github.com/infinyon/fluvio/actions/runs/9456228110/job/26051451364?pr=4059

digikata commented 2 weeks ago

Looking at the code again, I suspect base_offset needs to update k8-util/helm/fluvio-sys/templates/crd_partition.yaml - at least in the status additionalPrinterColumns section. Though I'm not sure this is the fault.

@PanGan21 or @EstebanBorai, one way to try to get the error under a better analysis environment is to setup a k8s with stable fluvio and upgrade it. Otherwise most of this help is a shot in the dark. (Unfortunately, I don't think I have time to do that until next week).

EstebanBorai commented 2 weeks ago

Looking at the code again, I suspect base_offset needs to update k8-util/helm/fluvio-sys/templates/crd_partition.yaml - at least in the status additionalPrinterColumns section. Though I'm not sure this is the fault.

@PanGan21 or @EstebanBorai, one way to try to get the error under a better analysis environment is to setup a k8s with stable fluvio and upgrade it. Otherwise most of this help is a shot in the dark. (Unfortunately, I don't think I have time to do that until next week).

Thanks for the recommendation @digikata! I will give this a shot and get back to this PR with news.

EstebanBorai commented 1 week ago

Hi @PanGan21! Im taking a look at this right now.

I can see the problem is that baseOffset is missing in Partitions CRD when fetching Kubernetes for the PartitionSpec.

The error looks like:

2024-06-24T19:47:14.987321Z ERROR MetadataDispatcher{spec="Partition" namespace="default"}: fluvio_stream_dispatcher::dispatcher::metadata: error with reconciliation loop: Custom {
    kind: InvalidData,
    error: "error retrieving k8: missing field `baseOffset` at line 1 column 1764",
}, sleep 10 seconds
PanGan21 commented 1 week ago

PartitionSpec

Thanks for taking a look @EstebanBorai . Since we added base offset on the PartitionStatus I am wondering why it should also be in PartitionSpec. Also how did you manage to get this log? By upgrading a local cluster?

From your experiment on #4077 I also see that I am probably missing #[cfg_attr(feature = "use_serde", serde(default))] on the base_offset. Should I add this as well?

EstebanBorai commented 1 week ago

PartitionSpec

Thanks for taking a look @EstebanBorai . Since we added base offset on the PartitionStatus I am wondering why it should also be in PartitionSpec. Also how did you manage to get this log? By upgrading a local cluster?

From your experiment on #4077 I also see that I am probably missing #[cfg_attr(feature = "use_serde", serde(default))] on the base_offset. Should I add this as well?

Hi @PanGan21!

You are right, thats an approach to fix the missing baseOffset error when parsing previous cluster CRDs on cluster upgrade.

EstebanBorai commented 6 days ago

@PanGan21 Im working on state update improvements so we can land your PR!

Will get back to you in this PR when these changes are applied!