Tyrrrz / DiscordChatExporter

Exports Discord chat logs to a file
MIT License

Optimize fetching threads #1125

Closed slatinsky closed 1 year ago

slatinsky commented 1 year ago

Since thread support was added, the Fetching channels... stage takes too long for larger guilds with many threads. Some data doesn't need to be fetched, because we already know that those channels/threads won't be exported.

If the --after parameter is specified, the main speed-up comes from breaking out of thread fetching early: threads are sorted by LastMessageId (endpoint channels/{channel.Id}/threads/search with query parameters sort_by=last_message_time&sort_order=desc), and After is compared against each thread's LastMessageId. This check is implemented only for user tokens.
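The early break can be sketched roughly as follows. This is a minimal Python illustration, not the project's C# code: `Thread`, `threads_active_after`, and the `fetch_page(offset)` callback are hypothetical names, and the sketch assumes each page arrives sorted by last message time, newest first, as the `sort_order=desc` query requests.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional


@dataclass
class Thread:
    id: int
    last_message_id: int  # snowflake of the newest message in the thread


def threads_active_after(
    fetch_page: Callable[[int], List[Thread]],
    after: Optional[int],
) -> Iterator[Thread]:
    """Yield threads that contain at least one message newer than `after`.

    `fetch_page(offset)` is a hypothetical callback returning one page of
    threads sorted by last message time (newest first), or an empty list
    when there are no more pages.
    """
    offset = 0
    while True:
        page = fetch_page(offset)
        if not page:
            return  # no more pages
        for thread in page:
            # Snowflakes grow with time, so a thread whose newest message is
            # not after `after` has nothing to export. Because the list is
            # sorted newest first, every remaining thread is older still,
            # so fetching can stop here.
            if after is not None and thread.last_message_id <= after:
                return
            yield thread
        offset += len(page)
```

With this shape, the number of requests depends on how many threads were active after the cutoff rather than on the total number of threads in the channel.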

Skip thread export for channel if:

Skip export of these ordinary channels if:


Potential problems:

Tested: CLI JSON export using exportGuild with --include-threads All and combinations of the --before and --after parameters.

Tyrrrz commented 1 year ago

Thanks!

Tyrrrz commented 1 year ago
  • Old bug: a timestamp supplied using --before can overflow into the future. For example, --before "2000-08-30T00:00:00.000Z" overflows to snowflake 16548918821562351616 (2140-01-12T07:35:11.104Z). With more early checks added, there may be more cases where this bug becomes an issue.

Can you please also open a PR to fix that in the Snowflake.FromDate(...) method (or wherever applicable)? We can assume that a valid snowflake should be dated no earlier than Discord's launch.
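For illustration, here is a rough Python sketch of the overflow and of one possible guard. It assumes the conversion subtracts the Discord epoch (1420070400000 ms, i.e. 2015-01-01T00:00:00Z) from the Unix timestamp and shifts the result left by 22 bits in unsigned 64-bit arithmetic; that is an assumption about how Snowflake.FromDate behaves, not code copied from the project, and all function names below are hypothetical. Under this arithmetic, the snowflake 16548918821562351616 quoted above is what a 2000-08-30T00:00:00Z timestamp wraps to.

```python
from datetime import datetime, timedelta, timezone

DISCORD_EPOCH_MS = 1420070400000  # 2015-01-01T00:00:00Z in Unix milliseconds
U64_MASK = (1 << 64) - 1          # emulates unsigned 64-bit wraparound
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def unix_ms(instant: datetime) -> int:
    """Whole milliseconds since the Unix epoch (exact integer arithmetic)."""
    return (instant - UNIX_EPOCH) // timedelta(milliseconds=1)


def snowflake_from_date_wrapping(instant: datetime) -> int:
    """Assumed buggy behavior: pre-2015 dates go negative and wrap around."""
    return ((unix_ms(instant) - DISCORD_EPOCH_MS) << 22) & U64_MASK


def snowflake_from_date_clamped(instant: datetime) -> int:
    """One possible guard: treat anything before the Discord epoch as 0."""
    offset_ms = max(unix_ms(instant) - DISCORD_EPOCH_MS, 0)
    return (offset_ms << 22) & U64_MASK


def snowflake_to_date(snowflake: int) -> datetime:
    """Recover the timestamp encoded in the upper bits of a snowflake."""
    return UNIX_EPOCH + timedelta(milliseconds=(snowflake >> 22) + DISCORD_EPOCH_MS)


old = datetime(2000, 8, 30, tzinfo=timezone.utc)
wrapped = snowflake_from_date_wrapping(old)
print(wrapped, snowflake_to_date(wrapped).isoformat())
print(snowflake_from_date_clamped(old))
```

Clamping to the epoch is only one option; rejecting such dates with an error would be another, and which one fits better depends on how --before and --after should treat pre-Discord timestamps.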