Tracing outside of IO - Githubissues

mpardalos commented 4 years ago

I'm currently trying to instrument my language, kima using opentelemetry and I'm facing the issue that most interesting areas to add spans to are not in IO.

Would it be possible to relax the constraints for withSpan to just MonadMask or maybe provide a separate version that works like that?

I assume that it wouldn't really be possible from an entirely pure context because we can't guarantee the order, but I think it should be possible in monads without MonadIO at least.

ethercrow commented 4 years ago

I don't think monads by themselves guarantee any evaluation order. It might still work to some extent, there may be an issue with endSpan event being emitted before the corresponding beginSpan one, but that may be fixable with some usage of seq. Please feel free to experiment if you have time.

mpardalos commented 4 years ago

I'm currently testing this with BangPatterns and this code seems to be working:

withSpanM_ :: Monad m => String -> m a -> m a
withSpanM_ name action = do
    let !s = unsafePerformIO (beginSpan name)
    res <- action
    let !_ = unsafePerformIO (endSpan s)
    return res

However there is a good chance that it just happens to and it wouldn't work with some different code inside the span. I'm not really sure about how to get the evaluation order I want in GHC or what I could test to see if this is working. Any pointers?

Also, I'm using unsafePerformIO here to avoid altering OpenTelemetry's code but I think it could be avoided if it used traceEvent instead of traceEventIO

ethercrow commented 4 years ago

First thing that comes to mind is that when your code is pure you should want GHC's native profiling instead of tracing.

yaitskov commented 4 years ago

Is it another appealing case to an effect library? We need to transfer trace id somehow anyway. Right now it is mixed with business logic arguments. I looked at co-log library. Logging library is pretty close to OpenTelemetry. It is using very advanced types for its functions.

mpardalos commented 4 years ago

@ethercrow, I think the two have different uses. You can much more easily get a "high-level" overview of performance with tracing than GHC's profiler.

In my case, I want tracing to see (A) which compiler phases are worth optimising and (B) which functions in the source are particularly taxing to compile. Both of these are hard to do with just profiling, (B) especially because I want to separately profile invocations of a function based on the data (the source program). This is trivial with tracing but hard with profiling. Since basically my whole compiler is pure (but monadic) I would find this very useful.

mpardalos commented 4 years ago

I've also realised there is another issue with making beginSpan pure, other than the trace. newUnique also requires IO and it seems like a harder problem to solve.

ethercrow commented 4 years ago

Maybe consider a separate main for tracing that looks like this:

!ir1 <- withSpan "phase1" (pure $!! phase1 ir0)
!ir2 <- withSpan "phase2" (pure $!! phase2 ir1)
...

where phase0 and phase1 are still pure functions.

With GHC it's pretty likely that if your code looks like phase2 (phase1 ir0), there's no clear phase separation at runtime. Something is inlined, something is eliminated, something is reordered.

@mpickering what do you thing about tracing of pure things? Does GHC's withTiming thing deal with those somehow?

mpardalos commented 4 years ago

After the conversations we've had I think that this is not really compatible with laziness. I think it's best to just stick to doing logging in IO only.

Closing

ethercrow / opentelemetry-haskell

Tracing outside of IO #22