Tyrrrz / DiscordChatExporter

Exports Discord chat logs to a file
MIT License

Chat thread missing original message thread was created from #1265

Open realkosty opened 4 months ago

realkosty commented 4 months ago

Version

v2.43.3

Flavor

CLI (Command-Line Interface)

Platform

MacOS

Export format

TXT

Steps to reproduce

https://discord.gg/XYVZf6we (public channel)

./DiscordChatExporter.Cli exportguild -t "$DISCORD_TOKEN" -g 621778831602221064 --after "2024-06-11 00:00" --before "2024-06-14 00:00" --include-threads all -f PlainText -o ~/Desktop/Chat-thread-missing-original-%G-%a-%b

Locate the exported file for the thread.

Details

Actual result:

Note that the thread's original message, [6/12/2024 6:03 PM] charonthegondolier, is missing:

==============================================================
Guild: Sentry Community
Channel: Sentry / 🪀|chat / When grouping things based on callstack
After: 6/23/2023 12:00 AM
==============================================================

[6/12/2024 7:33 PM] dhrumilpm

[6/12/2024 7:33 PM] dhrumilpm
Hi do you mind providing some more details about your use case, what are things in your stacktrace that we should be avoiding?

You can check this doc out to see if custom fingerprinting can help solve your use:

https://docs.sentry.io/concepts/data-management/event-grouping/fingerprint-rules/

{Embed}
https://docs.sentry.io/concepts/data-management/event-grouping/fingerprint-rules/
Fingerprint Rules
Learn about fingerprint rules, matchers for fingerprinting, how to combine matchers, and using variables for fingerprinting.
https://images-ext-1.discordapp.net/external/jFgbTN27f4PjB2EGzrUacwBAzX1TBZ0D_7VPUdNhUmc/https/sentry-docs-cc5bz9ym0.sentry.dev/meta.jpg

[6/13/2024 6:45 AM] charonthegondolier
It's mostly the stuff at the very beginning,
FEngineLoop::Tick vs. FTaskGraphCompatibilityImplementation::ProcessThreadUntilRequestReturn

These are two originating points where the code can end up being called, but these are things like 30 steps down the frame, i don't care about the stuff *that* far back in the frame.

==============================================================
Exported 3 message(s)
==============================================================

Expected result

==============================================================
Guild: Sentry Community
Channel: Sentry / 🪀|chat / When grouping things based on callstack
After: 6/23/2023 12:00 AM
==============================================================

[6/12/2024 6:03 PM] charonthegondolier
When grouping things based on callstack, is there a way to tell sentry where is a useful place in callstacks to begin grouping from, and ignore everything before X?

[6/12/2024 7:33 PM] dhrumilpm
Hi do you mind providing some more details about your use case, what are things in your stacktrace that we should be avoiding?

You can check this doc out to see if custom fingerprinting can help solve your use:

https://docs.sentry.io/concepts/data-management/event-grouping/fingerprint-rules/

{Embed}
https://docs.sentry.io/concepts/data-management/event-grouping/fingerprint-rules/
Fingerprint Rules
Learn about fingerprint rules, matchers for fingerprinting, how to combine matchers, and using variables for fingerprinting.
https://images-ext-1.discordapp.net/external/jFgbTN27f4PjB2EGzrUacwBAzX1TBZ0D_7VPUdNhUmc/https/sentry-docs-cc5bz9ym0.sentry.dev/meta.jpg

[6/13/2024 6:45 AM] charonthegondolier
It's mostly the stuff at the very beginning,
FEngineLoop::Tick vs. FTaskGraphCompatibilityImplementation::ProcessThreadUntilRequestReturn

These are two originating points where the code can end up being called, but these are things like 30 steps down the frame, i don't care about the stuff *that* far back in the frame.

==============================================================
Exported 3 message(s)
==============================================================


Tyrrrz commented 3 months ago

Hi. Can you check if this also happens when you export in HTML? I think the issue might be that the TXT format does not know how to render the "thread start" system event.

realkosty commented 3 months ago

Hi @Tyrrrz, no it looks like HTML behaves the same way:

Thread - EXPECTED (Discord.com)

[screenshot: expected-thread]

Thread - ACTUAL (thread export HTML)

Note that the full initial message is missing here; only a truncated version of it appears, as the subject line.

[screenshot: actual-thread-HtmlDark]

Channel view on Discord.com

[screenshot: expected-channel]

Channel export HTML

[screenshot: actual-channel-HtmlDark]

Tyrrrz commented 3 months ago

Thanks for the screenshots

realkosty commented 2 months ago

@Tyrrrz could you please point me to the area in the code to look at? or alternatively lmk if you are interested in a commercial arrangement to prioritize this fix on your end

Tyrrrz commented 2 months ago

@realkosty your points of interest are:

In order of execution.

Tyrrrz commented 2 weeks ago

The issue appears to be that the thread's starting message has "content": "", and carries the actual message in referenced_message instead:

{
  "type": 21,
  "content": "",
  "mentions": [],
  "mention_roles": [],
  "attachments": [],
  "embeds": [],
  "timestamp": "2024-11-07T16:40:27.109000+00:00",
  "edited_timestamp": null,
  "flags": 0,
  "components": [],
  "id": "1304123312372453390",
  "channel_id": "1304123293049294940",
  "author": {
    "id": "128178626683338752",
    "username": "tyrrrz",
    "avatar": "d32cfe56f68f523cda67c9c5a3ef57aa",
    "discriminator": "0",
    "public_flags": 0,
    "flags": 0,
    "banner": null,
    "accent_color": null,
    "global_name": "Tyrrrz",
    "avatar_decoration_data": null,
    "banner_color": null,
    "clan": null
  },
  "pinned": false,
  "mention_everyone": false,
  "tts": false,
  "message_reference": {
    "type": 0,
    "channel_id": "1304123107845476363",
    "message_id": "1304123293049294940",
    "guild_id": "866458392705105940"
  },
  "position": 0,
  "referenced_message": {
    "type": 0,
    "content": "Thread starting message with special characters ? \" / | >",
    "mentions": [],
    "mention_roles": [],
    "attachments": [],
    "embeds": [],
    "timestamp": "2024-11-07T16:40:22.502000+00:00",
    "edited_timestamp": null,
    "flags": 32,
    "components": [],
    "id": "1304123293049294940",
    "channel_id": "1304123107845476363",
    "author": {
      "id": "128178626683338752",
      "username": "tyrrrz",
      "avatar": "d32cfe56f68f523cda67c9c5a3ef57aa",
      "discriminator": "0",
      "public_flags": 0,
      "flags": 0,
      "banner": null,
      "accent_color": null,
      "global_name": "Tyrrrz",
      "avatar_decoration_data": null,
      "banner_color": null,
      "clan": null
    },
    "pinned": false,
    "mention_everyone": false,
    "tts": false,
    "thread": {
      "id": "1304123293049294940",
      "type": 11,
      "last_message_id": "1304123315022991390",
      "flags": 0,
      "guild_id": "866458392705105940",
      "name": "Thread starting message with special",
      "parent_id": "1304123107845476363",
      "rate_limit_per_user": 0,
      "bitrate": 64000,
      "user_limit": 0,
      "rtc_region": null,
      "owner_id": "128178626683338752",
      "thread_metadata": {
        "archived": false,
        "archive_timestamp": "2024-11-07T16:40:27.071000+00:00",
        "auto_archive_duration": 4320,
        "locked": false,
        "create_timestamp": "2024-11-07T16:40:27.071000+00:00"
      },
      "message_count": 1,
      "member_count": 1,
      "total_message_sent": 1
    }
  }
}

I think Discord uses the same approach for forwarded messages as well. We already parse this data, so we need to figure out how to render it properly.
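A minimal sketch of one way to handle this, written in Python for illustration only (it is not DiscordChatExporter's code, and the names `resolve_rendered_message` and `format_line` are hypothetical). It assumes the payload shape shown above: when a message has type 21 and empty content, the renderer falls back to the embedded `referenced_message`, so the original author, timestamp and text are shown.

```python
THREAD_STARTER_MESSAGE = 21  # message type of the empty starter message in the payload above


def resolve_rendered_message(message: dict) -> dict:
    """Return the message whose author, timestamp and content should be rendered.

    Assumption: a thread starter message with empty content carries the
    real message in "referenced_message", as in the payload above.
    """
    referenced = message.get("referenced_message")
    if (
        message.get("type") == THREAD_STARTER_MESSAGE
        and not message.get("content")
        and referenced is not None
    ):
        return referenced
    return message


def format_line(message: dict) -> str:
    """Render a message roughly the way the TXT export lays it out (hypothetical helper)."""
    shown = resolve_rendered_message(message)
    return f"[{shown['timestamp']}] {shown['author']['username']}\n{shown['content']}"


# Trimmed-down copy of the payload from this issue.
starter = {
    "type": 21,
    "content": "",
    "timestamp": "2024-11-07T16:40:27.109000+00:00",
    "author": {"username": "tyrrrz"},
    "referenced_message": {
        "type": 0,
        "content": 'Thread starting message with special characters ? " / | >',
        "timestamp": "2024-11-07T16:40:22.502000+00:00",
        "author": {"username": "tyrrrz"},
    },
}

print(format_line(starter))
```

If the referenced message is absent or null (for example, if the original was deleted), the sketch leaves the starter message unchanged, so the current behavior is preserved in that case.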