Tyrrrz / DiscordChatExporter

Exports Discord chat logs to a file
MIT License

Channel export does not include all threads in forum #1129

Closed PGimenez closed 1 year ago

PGimenez commented 1 year ago

Version

latest

Flavor

CLI (Command-Line Interface)

Platform

Docker

Export format

JSON

Steps to reproduce

docker run --rm tyrrrz/discordchatexporter:latest channels -t ${token} -g ${guildId} --include-threads

Details

Thanks for creating this tool, and especially for the new forum thread export feature.

I'm trying to export the threads in the forum of community 774897545717219328 (Genie Framework) with the latest Docker version, using this command:

docker run --rm tyrrrz/discordchatexporter:latest channels -t ${token} -g ${guildId} --include-threads

It works, but only the latest 20 threads are exported.


Is there any way to force exporting the entire forum? I looked at the options in the docs but couldn't make it work.


Tyrrrz commented 1 year ago

If you want to include archived threads, use --include-threads all. By default it's just active.

PGimenez commented 1 year ago

I repeated the export with that option and now I'm getting more threads. However, it does not export threads from before April 17. Is there any way to force a date range?

Tyrrrz commented 1 year ago

> I repeated the export with that option and now I'm getting more threads. However, it does not export threads from before April 17. Is there any way to force a date range?

There should be nothing that limits how far back the threads can be fetched.

@slatinsky could it be something related to our recent optimizations? 🤔

slatinsky commented 1 year ago

I tested the code in DiscordClient.cs by adding this print before yield return thread (in 2 places):

Console.WriteLine(currentOffset + " " + thread?.Parent?.Name + " yield return" + thread?.Name);

It correctly fetched all archived and active threads (currentOffset went past 2150 for archived threads in a single guild channel and past 150 for active threads). The thread fetching logic in DiscordClient.cs should be correctly implemented.

Also, a lot of the new checks use the Before/After boundaries, which OP didn't set. Those checks should be skipped if Before or After is null (not verified).
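
The idea that boundary checks must be skipped when no boundary was supplied can be sketched as follows. This is only an illustration, not the project's actual C# code: the function name `should_export` and its parameters are hypothetical, and IDs are treated as plain integers that grow over time, as Discord snowflakes do.

```python
from typing import Optional


def should_export(
    channel_id: int,
    last_message_id: Optional[int],
    after: Optional[int] = None,
    before: Optional[int] = None,
) -> bool:
    """Return True if the channel could hold messages inside the requested range.

    Hypothetical helper: a boundary that was not supplied (None) must never
    exclude a channel.
    """
    # Only apply the lower bound when the caller actually set one.
    if after is not None:
        # An empty channel, or one whose newest message is not newer than
        # the bound, cannot contain anything after it.
        if last_message_id is None or last_message_id <= after:
            return False

    # Only apply the upper bound when the caller actually set one.
    if before is not None:
        # Messages are never older than the channel that contains them.
        if channel_id >= before:
            return False

    return True
```

With neither boundary set, as in OP's command, every channel passes the filter, so such checks alone would not explain missing threads.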

my command:

DiscordChatExporter.Cli exportguild --guild 774897545717219328 --include-threads All --format Json --fuck-russia --markdown false --token <redacted> --output ../../../_my_test/

slatinsky commented 1 year ago

I joined OP's Discord server and ran the command DiscordChatExporter.Cli channels -t <redacted> -g 774897545717219328 --include-threads All

Then I counted all the posts in help-forum.

It seems to export them correctly, but the order is not what you might expect (the channels listed by the channels command are not sorted by lastMessageId, which may have confused OP).

I used the latest code from the master branch, built from source.
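
To check whether older threads are actually present in a listing, it can help to sort the entries by last message ID and decode the IDs into dates. The sketch below relies on Discord snowflake IDs encoding a millisecond timestamp in their upper bits, counted from 2015-01-01 UTC. The record layout (dicts with `name` and `lastMessageId` keys) and the sample values are hypothetical and are not the tool's output format.

```python
from datetime import datetime, timezone

# Discord's epoch: 2015-01-01T00:00:00Z, in milliseconds since the Unix epoch.
DISCORD_EPOCH_MS = 1420070400000


def snowflake_to_datetime(snowflake: int) -> datetime:
    """Decode the creation time embedded in a Discord snowflake ID."""
    # The top 42 bits hold milliseconds since the Discord epoch.
    timestamp_ms = (snowflake >> 22) + DISCORD_EPOCH_MS
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)


def sort_by_last_message(threads: list) -> list:
    """Return threads ordered from most to least recently active."""
    return sorted(threads, key=lambda t: t["lastMessageId"], reverse=True)


if __name__ == "__main__":
    # Hypothetical records for illustration only.
    threads = [
        {"name": "older-thread", "lastMessageId": 1090000000000000000},
        {"name": "newer-thread", "lastMessageId": 1100000000000000000},
    ]
    for thread in sort_by_last_message(threads):
        print(thread["name"], snowflake_to_datetime(thread["lastMessageId"]).date())
```

Sorting this way puts the most recently active threads first, which makes it easier to spot whether anything before a given date is missing.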

Tyrrrz commented 1 year ago

Thank you @slatinsky!

PGimenez commented 1 year ago

Apparently it was an issue with the token I was using. I generated it for another bot, so I thought it would work. I tried with my account's token and it did extract all threads.

Thanks for checking, and sorry for the hassle!

slatinsky commented 1 year ago

Bot tokens use a different endpoint to fetch threads. Maybe it fails only for bot tokens; I only tested thread export using a user token.

Tyrrrz commented 1 year ago

It would be weird for only some threads to be missing, even with a bot token, unless there's some permission issue I'm not aware of.

pairofcrocs commented 11 months ago

I can confirm that personal tokens seem to be the only way to load all of the threads.