Tyrrrz / DiscordChatExporter

Exports Discord chat logs to a file
MIT License

Export JSON takes exceptionally long #502

Closed (Fyko closed this issue 3 years ago)

Fyko commented 3 years ago

Hello! I'm the maintainer of https://github.com/fyko/export-api. With the upcoming release of our v2 API, users will be able to export to JSON.

During testing, it took 79 seconds to export 300 messages. The resulting file was only about 300 KB, so I don't see why it should take this long. I'm not very confident in my C# skills, so I couldn't pinpoint exactly where the delay is coming from.

Tyrrrz commented 3 years ago

That sounds abnormal. Writing JSON (or any other format) should not take that long. Are you sure it's not a latency issue? Is it consistent? Are the other formats faster to export?

Fyko commented 3 years ago

After some benchmarks, here are the results I got:

  1. Plaintext (0) - 1584ms
  2. HTML Dark (1) - 2322ms
  3. HTML Light (2) - 1578ms
  4. CSV (3) - 1482ms
  5. JSON (4) - 73455ms (73.455 seconds)
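
For scale, the figures above can be expressed as multiples of the fastest format. This is a minimal Python sketch for illustration only; it is not part of the project (which is written in C#), and the function name `slowdown_vs_fastest` is made up here. The numbers are copied from the list above.

```python
# Timings reported in the list above, in milliseconds.
timings_ms = {
    "Plaintext": 1584,
    "HTML Dark": 2322,
    "HTML Light": 1578,
    "CSV": 1482,
    "JSON": 73455,
}


def slowdown_vs_fastest(timings):
    """Return each format's time as a multiple of the fastest format's time."""
    fastest = min(timings.values())
    return {name: ms / fastest for name, ms in timings.items()}


ratios = slowdown_vs_fastest(timings_ms)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x")
```

On these figures JSON comes out at roughly 50 times the CSV time, while every other format stays within about 1.6 times of it.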

Edit: after some more handy stopwatch work, the problem looks to be with writing each individual message, which takes anywhere from 80 to 110ms

Edit 2: I've added a bunch of diag - https://paste.nomsy.net/awusetofaq.cs; example output:

Starting 642601334046064650
Wrote 642601334046064650 meta in 0ms
Wrote 642601334046064650 content in 0ms
Wrote 642601334046064650 author data in 0ms
Wrote 642601334046064650 attachments in 0ms
Wrote 642601334046064650 embeds in 0ms
Wrote 642601334046064650 reactions in 0ms
Wrote 642601334046064650 mentions in 0ms
Flushing 642601334046064650
Flushed in 0ms
Wrote message 642601334046064650 in 91ms

Tyrrrz commented 3 years ago

Hm, so is flush taking up most of that time? Everything is 0ms all the way until the end. No idea what could be causing it.

Fyko commented 3 years ago

I don't think it's flushing because the log says it takes 0ms.

Fyko commented 3 years ago

I stand corrected, after a little rewrite of the diag:

Starting 642601055862915112
Wrote 642601055862915112 meta after 0ms
Wrote 642601055862915112 content after 0ms
Wrote 642601055862915112 author data after 0ms
Wrote 642601055862915112 attachments after 0ms
Wrote 642601055862915112 embeds after 0ms
Wrote 642601055862915112 reactions after 0ms
Wrote 642601055862915112 mentions after 0ms
Flushed after 91ms
Wrote message 642601055862915112 after 91ms
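
The two logs differ in what they report: the first gives a duration per step ("in 0ms"), the second gives the time elapsed since the message started ("after 91ms"). A cumulative reading cannot lose time that falls between two measurements, which is why it points at the flush. Below is a minimal Python sketch of that idea. It makes no claim about how the actual diagnostic code was written; `StageTimer`, `write_message`, and the stage callables are hypothetical names, and the real project is C#.

```python
import time


class StageTimer:
    """Records, after each stage, both the step time and the time since start."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self._start = clock()
        self._last = self._start
        self.records = []  # (stage, ms since previous mark, ms since start)

    def mark(self, stage):
        now = self._clock()
        step_ms = (now - self._last) * 1000
        total_ms = (now - self._start) * 1000
        self._last = now
        self.records.append((stage, step_ms, total_ms))
        return step_ms, total_ms


def write_message(timer, stages):
    """Run each (name, callable) stage in order and mark the timer after it."""
    for name, work in stages:
        work()
        timer.mark(name)


# Hypothetical stages: only "flush" does any noticeable work in this demo.
demo_stages = [
    ("content", lambda: None),
    ("mentions", lambda: None),
    ("flush", lambda: time.sleep(0.09)),
]
demo_timer = StageTimer()
write_message(demo_timer, demo_stages)
for stage, step_ms, total_ms in demo_timer.records:
    print(f"{stage}: step {step_ms:.0f}ms, after {total_ms:.0f}ms")
```

Because every mark is taken from the same clock and the start time is never reset, the "after" value of the last stage always equals the total for the message, so a slow stage shows up as a jump between two consecutive lines.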

Fyko commented 3 years ago

I'm not sure if anything can be done to fix this.

Tyrrrz commented 3 years ago

If the biggest delay comes from flushing, that means it's on the side of System.Text.Json. What you're describing is abnormal, though, and I don't know what could be causing it. Maybe try to export to a file instead and see if it's faster? If it is, then the problem may be with the stream.

Tyrrrz commented 3 years ago

Never mind, I see you're exporting to a file; I thought you were streaming directly. I have no clue then 🤷

r1bnc commented 3 years ago

Is this not fixed yet?

Tyrrrz commented 3 years ago

@r1bnc did you fix it?

r1bnc commented 3 years ago

? I was just asking whether this issue was fixed.

96-LB commented 3 years ago

Has anyone else encountered this? I haven't been able to reproduce it. @Fyko, does this happen on every channel that you export, or just one/a few?