Closed alex-stoica closed 4 days ago
@silvanocerza do you want to take over this PR review?
Nice catches, @vblagoje. Solved them
@alex-stoica thanks once again for your efforts. We seem to have caused a small regression where spans don't have latency attached to them. I'm not talking about the time-to-first-token work that @LastRemote recently did, but this particular change in span management and context-stack handling in the trace method, which has likely led to spans not having the correct latency information.
Here's how this latency was available on Langfuse before:
as well as in generation span:
While this information is not available when running your PR branch:
Would you please have a look at where/how we lost this important info? For the above screenshots I used chat.py in the example directory of the langfuse integration. Perhaps you can use it as well during your dev/debug cycles. Looking forward to your fixes here. 🙏
@vblagoje Indeed, I had overlooked the `end()` method call, and it turned out to be a bit trickier than I initially expected. At first, I tried updating the span with `start_time` and `end_time`, but after taking a closer look, I realized that calling `end()` effectively addressed the issue.
I've also tested locally with `chat.py` using `AzureOpenAIChatGenerator`, and it worked as expected.
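For what it's worth, the mechanism behind the fix can be illustrated with a small, self-contained sketch (the `FakeSpan` class below is a stand-in for illustration, not the Langfuse SDK): a span only carries latency once `end()` stamps its end time, so a tracer context manager that forgets to call `end()` on exit yields spans with no latency.

```python
import time
from contextlib import contextmanager
from datetime import datetime, timezone


class FakeSpan:
    """Stand-in for a tracing span: end() stamps the end time,
    from which the backend can derive latency."""

    def __init__(self):
        self.start_time = datetime.now(timezone.utc)
        self.end_time = None

    def end(self):
        self.end_time = datetime.now(timezone.utc)

    @property
    def latency(self):
        # Without an end() call, end_time stays None and no latency exists.
        if self.end_time is None:
            return None
        return (self.end_time - self.start_time).total_seconds()


@contextmanager
def trace():
    span = FakeSpan()
    try:
        yield span
    finally:
        # The crucial call: skipping this leaves the span open,
        # which is exactly the missing-latency symptom described above.
        span.end()


with trace() as span:
    time.sleep(0.01)  # simulate some pipeline work

print(span.latency is not None)  # latency is available after end()
```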
@alex-stoica Thanks for making this change! I added the parent_span parameter in Haystack 2.7 with the intention of fixing a bug where Langfuse traces overlap with each other if two or more pipeline.run() calls run concurrently (which is very common in deployed pipelines), but I haven't had a chance to update the Langfuse integration part since then.
Your changes look good, although I'd recommend we always create a new root span if `parent_span` is None, to prevent the concurrency issue (there would be cases where the 2nd pipeline.run has `parent_span` set to None with a non-empty `current_span`, resulting in the follow-up trace being registered as a span instead of a trace). Also, I suppose you don't have to deal with meta this way.
Here is something I have in my draft:
```python
tracing_context: ContextVar[Dict[Any, Any]] = ContextVar("tracing_context", default={})
"""
A context variable containing information related to tracing. This can be used to store trace, user, and session IDs,
custom tags and pipeline version information.
"""


class LangfuseTracer(Tracer):
    @contextlib.contextmanager
    def trace(
        self, operation_name: str, tags: Optional[Dict[str, Any]] = None, parent_span: Optional[Span] = None
    ) -> Iterator[Span]:
        """
        Start and manage a new trace span.

        :param operation_name: The name of the operation.
        :param tags: A dictionary of tags to attach to the span.
        :param parent_span: The parent span to use for the newly created span.
        :return: A context manager yielding the span.
        """
        # pylint: disable=protected-access
        # This is to make it consistent with the original implementation
        tags = tags or {}
        span_name = tags.get("haystack.component.name", operation_name)
        if not parent_span:
            if operation_name != "haystack.pipeline.run":
                logger.warning(
                    "Creating a new trace without a parent span is not recommended for operation '%s'.",
                    operation_name,
                )
            # Create a new trace if no parent span is provided
            span = LangfuseSpan(
                self._tracer.trace(
                    name=self._name,
                    public=self._public,
                    id=tracing_context.get().get("trace_id"),
                    user_id=tracing_context.get().get("user_id"),
                    session_id=tracing_context.get().get("session_id"),
                    tags=tracing_context.get().get("tags"),
                    version=tracing_context.get().get("version"),
                )
            )
        else:
            if tags.get("haystack.component.type") in self.ALL_SUPPORTED_GENERATORS:
                span = LangfuseSpan(parent_span.raw_span().generation(name=span_name))
            else:
                span = LangfuseSpan(parent_span.raw_span().span(name=span_name))

        self._context.append(span)
        span.set_tags(tags)
        yield span
        ...  # The rest of the code is left unchanged
```
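If it helps the discussion, here is a minimal, self-contained sketch (names like `run_pipeline` are illustrative, not integration code) of why a `ContextVar` keeps concurrent `pipeline.run()` calls from sharing tracing state: each asyncio task operates on its own copy of the context, so two interleaved runs keep distinct trace IDs instead of overwriting each other.

```python
import asyncio
from contextvars import ContextVar
from typing import Dict

# Same shape as the draft above: per-context tracing information.
tracing_context: ContextVar[Dict[str, str]] = ContextVar("tracing_context", default={})


async def run_pipeline(trace_id: str) -> str:
    # Each asyncio task gets its own copy of the context at creation,
    # so this set() is invisible to the other concurrent run.
    tracing_context.set({"trace_id": trace_id})
    await asyncio.sleep(0.01)  # simulate pipeline work interleaving
    return tracing_context.get()["trace_id"]


async def main():
    # Two "pipeline runs" executing concurrently, as hayhooks would do.
    return await asyncio.gather(run_pipeline("run-1"), run_pipeline("run-2"))


results = asyncio.run(main())
print(results)  # each run kept its own trace_id
```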
I hope this helps.
@LastRemote I see where you're coming from. I also envisioned some "extra storage" in the tracer (https://github.com/deepset-ai/haystack-core-integrations/issues/1154#issuecomment-2472993625); however, I was unaware of the concurrency issues, as the fix that allows hayhooks to run different `run()` calls concurrently for the same pipeline was merged only very recently: https://github.com/deepset-ai/hayhooks/pull/27
Unfortunately, I will be unavailable for the next 3-4 days, so I recommend either integrating the fix as it stands or extending it with the context-variable proposal you mentioned, provided @vblagoje or someone else from the Haystack team approves. In that case, I think you can add a commit to extend this PR.
I think the people involved in the Langfuse integration (both from Haystack and external contributors like us) should be on the same page on multiple issues. In this case, I believe the context var approach could also be an opportunity to explore how it might assist in linking generators with their corresponding prompts, but I need to think a little bit more about it
Thank you for jumping in @silvanocerza and kudos to both of you @LastRemote and @alex-stoica. Keep these PRs coming!
> I think the people involved in the Langfuse integration (both from Haystack and external contributors like us) should be on the same page on multiple issues. In this case, I believe the context var approach could also be an opportunity to explore how it might assist in linking generators with their corresponding prompts, but I need to think a little bit more about it
@alex-stoica I agree. I hope there is a mailing list, a well-maintained GitHub label, or maybe a Discord channel to track all ongoing issues and feature requests (personally, I did a bad job maintaining issues, though). We can bring this up with the community managers and see how it can be done.
Related Issues

Proposed Changes:
The primary issue was caused by the missing `parent_span` parameter in `LangfuseTracer.trace`, as explained in the original issue discussion. Additionally, after updating the function signature to accept `parent_span`, `meta` was not found anymore, as the code didn't rely on it directly. The solution involves creating a root span when `parent_span` does not exist, or a child span when it does.

How did you test it?
Locally with the following setup:
And with a custom generator
Notes for the reviewer
I’ve left some commented code for a potential future implementation related to https://github.com/deepset-ai/haystack-core-integrations/issues/1154