temporalio / roadrunner-temporal

Temporal PHP-SDK Host Process plugin for Roadrunner
MIT License
22 stars 8 forks source link

[🐛 BUG]: RR doesn't send `identity` field when Workflow Update is completed #537

Open roxblnfk opened 3 months ago

roxblnfk commented 3 months ago

No duplicates 🥲.

What happened?

Other SDK's send identity field when complete an update. RR doesn't. image

Existing of the field also affects how it will be showed in UI:

image

RR Version

v2024.1.5

Relevant log output

No response

rustatian commented 3 months ago

Hello @roxblnfk 👋 Identity is the property of the Worker options, not the Update callback. It should be added (my guess, since PHP knows about the identity) somewhere on the PHP side, since I don’t have the possibility to inject individual fields into the update command.

roxblnfk commented 3 months ago

Looks like PHP SDK doesn't send identity to RR in all the responses. But I see that other events contain identity inside (UPDATE_ACCEPTED contains it). PHP SDK sends identity only in Client calls, that's why I think it's not a PHP SDK issue.

Also there are different identity values in RR case and it looks like a bug. Samples: history-python.json history-php.json

rustatian commented 3 months ago

Identity data should be somewhere in the proto Payload or Headers, I think. This is not an RR behavior, RR has only two methods: Reject and Complete update (there is no Validate update command). They’re accepting Failue or Payload (protobuf). If PHP-SDK doesn’t include this information in these payloads, from not only the RR side, but from the Go SDK side, there is no way (from the external API POV) to include this information.

Also there are different identity values in RR case and it looks like a bug.

Worker’s Identity in Python is some random number @ Name of your PC, while in RR, identity is a TaskQueue + UUID4 string that uniquely identifies each worker. Thus, you see 953@roxblnfk-pepelaz in Python and features-update/task_failure-96dcdc25-911e-464e-8998-ec001791ddc3:00c4dfc8-e4f8-485d-ab6e-399fb450774e (TaskQueue-UUID4) for the RoadRunner.

rustatian commented 3 months ago

To move further, I propose you to check the other proto Payload / Header you're sending to RR. I can guess, since you have the identity info in the first UPDATE_ACCEPTED call, the other calls should also contain some information missing from the first call.

roxblnfk commented 3 months ago

... in RR, identity is a TaskQueue + UUID4 string that uniquely identifies each worker....

Yah, it's clear. But I meant another thing: in the PHP history two values of identity: TaskQueue + UUID4 (from RR) and pid@hostname.

I've made a few checks and I see that pid@hostname values were written from client requests. I suppose it's OK in case UPDATE_ACCEPTED.

To move further, I propose you to check the other proto Payload / Header you're sending to RR

As I said before, PHP SDK sends identity only with Client requests. I've checked all the data that was sent from PHP to RR.

rustatian commented 3 months ago

pid@hostname might be set by the PHP. RR sets its ID in case of empty identity. This can be a sticky worker ID as well.

rustatian commented 3 months ago

Could you please attach RR logs (debug) as well in the description? Also, there is no such RR version 5.0.0, rr-temporal plugin v5.0.0 is also not released yet (with RR). Please update the description to include the actual RR version (would be nice if you test on the latest RR 2024.1.5 release, since I bumped the temporal API). If I understand correctly, there is a failure in your screenshots. What type of failure_info are you sending back to the RR? Keep in mind that some of the failure_info data contains identity.

EDIT: Just for the sake of experiment, try to send an ActivityFailureInfo and include identity in it.

rustatian commented 1 month ago

Ping @roxblnfk ^

roxblnfk commented 1 month ago

That issue has low priority for now.