livekit / protocol

LiveKit protocol. Protobuf definitions for LiveKit's signaling protocol
https://docs.livekit.io
Apache License 2.0
73 stars 62 forks source link

Rename participant_identity in Transcription #706

Closed dennwc closed 4 months ago

dennwc commented 5 months ago

There are two fields named participant_identity in context of Transcription. It's not clear which one corresponds to the transcriber agent, and which one is for the participant which got its speech transcribed.

This change renames the Transcription field to transcribed_participant_identity so to clarify that it's used for the transcribed participant.

changeset-bot[bot] commented 5 months ago

🦋 Changeset detected

Latest commit: c43fc1d52d860faee13457738e703645ea7dd2f1

The changes in this PR will be included in the next version bump.

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

💥 An error occurred when fetching the changed packages and changesets in this PR ``` Some errors occurred when validating the changesets config: The package or glob expression "github.com/livekit/protocol" specified in the `fixed` option does not match any package in the project. You may have misspelled the package name or provided an invalid glob expression. Note that glob expressions must be defined according to https://www.npmjs.com/package/micromatch. ```
theomonnom commented 5 months ago

Do we need the usecase where participant_identity is the transcriber, and the participant_identity inside the transcription is the participant that got its speech transcribed?

dennwc commented 5 months ago

That's a very good question :thinking:

The shared identity in DataPacket is used by all SDKs to route this data packet to a specific RemoteParticipant handler. In that sense, it would be easier for end users if this would be the identity of the transciption (same as audio pre-transcription).

But the problem with this approach is that currently LK server will force-set DataPacket.participant_identity to match identity of the transcription agent. We could special-case transcriptions, but this would kind of defeat the purpose of a shared field that always means the same thing.

Thus, what you propose makes more sense. So DataPacket.participant_identity is the agent who made the transcription, while Transcription.participant_identity is the identity of the participant's audio that was transcribed.

If that make case, let me rename Transcription.participant_identity to transcribed_participant_identity and remove the compatibility code from https://github.com/livekit/client-sdk-js/pull/1130.

biglittlebigben commented 5 months ago

Do we need the usecase where participant_identity is the transcriber, and the participant_identity inside the transcription is the participant that got its speech transcribed?

This is indeed the intent of the extra field.

biglittlebigben commented 5 months ago

I have no strong opinion wrt renaming the field or not.

dennwc commented 4 months ago

@biglittlebigben @theomonnom I removed the notice and renamed the field. WDYT?