microsoft / botframework-sdk

Bot Framework provides the most comprehensive experience for building conversation applications.
MIT License

Make it possible to store activity logs as transcript files in Blob Storage #6141

Open peterbozso opened 3 years ago

peterbozso commented 3 years ago

Is your feature request related to a problem? Please describe.

Currently, if you use the TranscriptLoggerMiddleware with the BlobsTranscriptStore implementation, the middleware will log activities to blob storage individually, creating a separate JSON file for each activity inside a folder that is named after the conversation ID. This is very strange to me, because I would expect the exact same behavior as the FileTranscriptLogger just with a different storage medium.

So the Blob Storage implementation should create a transcript file for each conversation (with the conversation ID as its name) and store all the activities related to that conversation in this file. The benefit of this approach would be (just like in the case of the FileTranscriptLogger) that I could download the transcript files from Blob Storage and load them directly into the Emulator on my local machine for troubleshooting/debugging a bot in production.

To me, the current implementation doesn't really feel productive, because if I want to troubleshoot a bot in production, I have to download and then open each JSON file that represents an activity individually and search across them for what I am looking for, which is super cumbersome, especially if you take non-message activities into account too. Instead, it would be much smoother to just download and open a transcript file in the Emulator, quickly see the whole flow of the conversation, and most of the time identify very quickly what went wrong. But maybe I am just missing something here and the JSON file-based solution is the better one. In that case, I would be really thankful if you could shed some light on my mistake(s)! :)

Describe the solution you'd like

Since I don't know why the BlobsTranscriptStore is implemented the way it is and I don't see the point of its current behavior, I would propose a complete rewrite of this class so that it behaves more like the FileTranscriptLogger does. If there's a reason for its current behavior and it should be kept because it's useful, then I'd propose adding a flag to this class's constructor that you can use to switch between the current behavior and the one I described above. Or add a separate implementation of ITranscriptStore that implements only the desired behavior without touching the BlobsTranscriptStore at all.

I am more than happy to help with the implementation, no matter which option the team thinks is the best!

Describe alternatives you've considered

I searched around and found this article, which describes a kind of workaround at the end: a script that you run locally to merge the individual activities' JSON files in Blob Storage into a transcript file on your local machine. This is fine and I can make it work, but I think there would be much less friction if the data in Blob Storage were already stored as transcript files.
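For illustration, here is a minimal sketch of what such a merge script could look like. It is not the script from the article. It assumes the per-activity JSON files of one conversation have already been downloaded into a local folder, that each file holds a single activity with an ISO 8601 `timestamp` in a uniform format, and that a transcript file is a JSON array of activities. The function name `merge_activities` and the paths are hypothetical.

```python
import json
from pathlib import Path


def merge_activities(activity_dir, transcript_path):
    """Merge per-activity JSON files from one folder into a single transcript file.

    Assumption: every *.json file in activity_dir contains exactly one activity.
    Returns the number of activities written.
    """
    activities = []
    for path in sorted(Path(activity_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            activities.append(json.load(handle))

    # Assumption: timestamps share one ISO 8601 format, so comparing them as
    # strings orders them chronologically. Activities without a timestamp
    # sort first.
    activities.sort(key=lambda activity: activity.get("timestamp", ""))

    with Path(transcript_path).open("w", encoding="utf-8") as handle:
        json.dump(activities, handle, indent=2)
    return len(activities)
```

Calling `merge_activities("downloads/my-conversation-id", "my-conversation-id.transcript")` would then produce one file per conversation, which is the layout this issue asks the store to write directly.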

joshgummersall commented 3 years ago

@peterbozso this is an issue that cuts across the Bot Framework SDK, so I'll transfer it there.

peterbozso commented 3 years ago

@joshgummersall Okay, thanks! So what's the next step now? As I mentioned in my issue description, I am more than happy to help with the implementation, but there I meant only the C# SDK. (I am comfortable with JS too, I'd just rather not work with it in my free time. :))

joshgummersall commented 3 years ago

I will consult with the SDK team to decide how to proceed. Thanks for filing and offering to assist with the implementation! We will get back to you once we have discussed it.

EricDahlvang commented 3 years ago

The owner of this is out on vacation until the new year.

garypretty commented 3 years ago

I like this suggestion. @EricDahlvang could you possibly ask the appropriate folks and try and understand if there is a specific reason we have things broken into individual blobs currently?

@peterbozso whilst I like the suggestion here, it isn't something I hear a lot of demand for, so we will need to consider priority once we have an answer to the above. One suggestion might be to go ahead and submit an implementation to the Bot Framework Community (which I currently manage) at https://github.com/BotBuilderCommunity/botbuilder-community-dotnet. This would make the implementation available to folks via NuGet and allow community contribs moving forward. Then, if we think it should be adopted into the SDK, we can still do that.

peterbozso commented 3 years ago

@garypretty Thanks for the suggestion, that's a good idea! I am already familiar with the Community project: I added a Postman collection to the tools repo 2 years ago. I'll wait for Eric's answer regarding the current implementation - maybe it's really just me "holding it wrong". :) But if my need is truly valid, I'll go ahead with the .NET implementation in the appropriate Community repo.

garypretty commented 3 years ago

Assigning to @EricDahlvang to comment as per above. Once we have that we can decide if we keep this open or close. Thanks again @peterbozso

EricDahlvang commented 3 years ago

Apologies for the late reply on this one.

I think the current BlobsTranscriptStore is optimized for writes and modeled after AzureBlobTranscriptStore, and it does not scale well for reads. I'm interested in looking at your suggestions more closely, @peterbozso. Thank you for raising this.

The transcript store interface should perhaps also provide more control over retrieval. In many scenarios reverse retrieval of transcripts is required, but the implementation does not support this.
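As a rough illustration of what reverse retrieval could mean here, the sketch below pages through a conversation's activities newest-first using a continuation token. It is an in-memory sketch, not the SDK's API: the function name `get_activities_reversed`, the `page_size` parameter, and the use of a stringified offset as the token are all assumptions made for the example.

```python
def get_activities_reversed(activities, page_size=20, continuation_token=None):
    """Return one page of activities, newest first, plus the token for the next page.

    Assumption: each activity is a dict with a sortable "timestamp" value.
    The continuation token is simply the stringified offset of the next page;
    it is None when there are no more pages.
    """
    ordered = sorted(activities, key=lambda a: a["timestamp"], reverse=True)
    start = int(continuation_token) if continuation_token else 0
    page = ordered[start:start + page_size]
    next_start = start + len(page)
    next_token = str(next_start) if next_start < len(ordered) else None
    return page, next_token
```

A caller would pass the returned token back in until it comes back as `None`. A real store would want to avoid sorting the whole conversation on every call, which is where the choice of storage layout starts to matter.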