Closed cyri113 closed 1 year ago
Looking at the list of intents, I think these intents should be at least selected (https://discord.com/developers/docs/topics/gateway#list-of-intents).
GUILDS (1 << 0)
- CHANNEL_CREATE
- CHANNEL_UPDATE
- CHANNEL_PINS_UPDATE
- THREAD_CREATE
- THREAD_UPDATE
- CHANNEL_PINS_UPDATE
- THREAD_MEMBER_UPDATE
- THREAD_MEMBERS_UPDATE *
- STAGE_INSTANCE_CREATE
GUILD_MEMBERS (1 << 1) **
- GUILD_MEMBER_ADD
- GUILD_MEMBER_UPDATE
- THREAD_MEMBERS_UPDATE *
GUILD_MODERATION (1 << 2)
- GUILD_AUDIT_LOG_ENTRY_CREATE
- GUILD_BAN_ADD
GUILD_EMOJIS_AND_STICKERS (1 << 3)
- GUILD_EMOJIS_UPDATE
- GUILD_STICKERS_UPDATE
GUILD_INTEGRATIONS (1 << 4)
- GUILD_INTEGRATIONS_UPDATE
- INTEGRATION_CREATE
- INTEGRATION_UPDATE
GUILD_INVITES (1 << 6)
- INVITE_CREATE
GUILD_MESSAGES (1 << 9)
- MESSAGE_CREATE
- MESSAGE_UPDATE
GUILD_MESSAGE_REACTIONS (1 << 10)
- MESSAGE_REACTION_ADD
DIRECT_MESSAGES (1 << 12) **Not sure of this**
- MESSAGE_CREATE
- MESSAGE_UPDATE
- CHANNEL_PINS_UPDATE
DIRECT_MESSAGE_REACTIONS (1 << 13) **Not sure of this**
- MESSAGE_REACTION_ADD
GUILD_SCHEDULED_EVENTS (1 << 16)
- GUILD_SCHEDULED_EVENT_CREATE
- GUILD_SCHEDULED_EVENT_UPDATE
- GUILD_SCHEDULED_EVENT_USER_ADD
AUTO_MODERATION_CONFIGURATION (1 << 20)
- AUTO_MODERATION_RULE_CREATE
- AUTO_MODERATION_RULE_UPDATE
AUTO_MODERATION_EXECUTION (1 << 21) **Not sure of this**
- AUTO_MODERATION_ACTION_EXECUTION
I didn't add the delete part of each intent because I wansn't sure that we would need them. Ene could help us more for the intents @TjitsevdM .
Here are a few different comments:
Additional tasks: Should extract joined date per account Should extract profile picture per account (if we want to show this in the member tables)
Should we specify that the bot should get all messages from every thread in a channel?
Getting a list of all channels is to show in the dashboard which channels can be selected for analysis. Based on the channel permissions, channels where the bot doesn’t have access yet are marked so that the user can give the bot access there if they want these channels to be included in the analysis.
For the next features, extracting voice would be great if possible!
I'm not sure how the bot gets info about the different channels in a server. I don't know if we need to know when a channel gets removed or whether we can just read the channels that are present. Similarly for threads. That's the only thing I can think of for now for which we could need the delete part of the intents that are not listed by @amindadgar
I just looked more into the data that we saved in rawinfos table before and I can see we have these type of data
DEFAULT
REPLY
THREAD_CREATED
THREAD_STARTER_MESSAGE
RECIPIENT_ADD
CHANNEL_NAME_CHANGE
APPLICATION_COMMAND
CONTEXT_MENU_COMMAND
Also thinking about the permissions something that crossed my mind was that, we cannot assume the activities of deletion of messages, channels, etc a kind of activity that we have in analytics, but for modifications of channels, messages, etc we can assume an activity (please others give feedback for this).
based on my opinion in the paragraph above, I'll update my previous message.
In Notion we have this document about API fields. Then there is this requirement document from autumn 2022
Other things to consider:
Answering @katerinabc
channels
.Anyway here's additional permissions that the bot should have (in addition to my message above)
GUILD_MEMBERS (1 << 1) **
- GUILD_MEMBER_REMOVE
GUILDS (1 << 0)
- CHANNEL_DELETE
I'm starting to think that we have to write down the permissions that aren't needed rather than writing down those ones that are needed (Since they're becoming huge).
each day the bot code runs and it does the following steps: 1- get a list of guilds based on the Guild collection in the database
2- for each guild it will check whether the bot has access to it or not. if not the isDisconnected will be set to true for that guild.
3- for each guild the isInProgress field will be set to true
4- for each guild, it gets a list of channels (not synced with selected channels) then it checks each channel and updates that channel if the channel exists in the database, if not it will create a document for that channel
5- for each guild, it gets a list of members then it checks each member and updates that member if the member exists in the database, if not it will create a document for that member
6- this is the extraction msgs step(the most important part): the code here is not clear at all but it supposed to extract messages from the selected channels and their threads within the selected period and then inserting them to the rawinfo collection. beside that it contains error too. (using function that does not exist - the error will catch but the console log inside the catch block is commented)
7- for each guild the isInProgress field will be set to false
note: there is a optional way that the bot will run for a single guild too.
An additional step we'll soon need to add in #5:
We haven't designed the opt-in system yet so for now is just about making sure the architecture enables us to easily add this
when creating or updating documents in the database, the bot is doing it one by one, which can be slow and inefficient. For example, when using the create method to create multiple documents, each document is created and saved to the database one at a time, resulting in a significant amount of overhead.
This issue occurs multiple times in the bot, as creating and updating rawInfos, accounts, channels, and guilds is done one by one. Refactoring these sections of the code to use insertMany can significantly improve the bot's performance.
Overall, by refactoring the code to use insertMany instead of create, the bot's performance can be improved, reducing the overhead of multiple database operations and potentially reducing execution time.
We've encountered an issue!
At present, when users connect to a server, they choose channels and a specific period in the settings.
Our objective is to retrieve channel messages following the selected channels and period.
However, the problem lies in the fact that the Discord API does not offer a method to obtain messages based on a date. Instead, it allows for fetching messages before, after, or around a particular message, using the message ID.
While it's possible to filter messages based on the period locally, it might not be efficient to loop through a large number of messages (e.g., 10,000). Currently, our bot filters each message to determine if it falls within the desired period.
One suggested workaround is to disregard the user-selected period, retrieve all channel messages, and store them in a database. The analyzer app can then perform its task based on the period.
It's worth noting that, initially, we need to obtain all messages. This will provide us with the ID of the most recent message, enabling us to fetch messages after that ID for subsequent extractions.
Nima solution :
- Retrieve the initial 100 messages.
- Inspect the last message in the 100-element array and compare it to the period.
- If the message falls within the period, proceed to fetch an additional 100 messages sent before the previous last message.
- If the message is not within the period, iterate through the array and filter out messages that don't belong to the specified period (the fetch messages process will stop).
This method requires looping over an array with only 100 elements at most, just once.
solutions:
A) extract all messages and filter them based on the period then saved them into DB(bot currently does this) looping over an large array is not efficient considering Node.js architecture
B)extract all messages and saved them into DB saving a large amount of data in the DB.
C) the approach which @nimatorabiv suggested :extract needed messages (in the worst case it will extract extra 100 messages which will filter with a single loop over 100 elements) and saved them into DB. less API call and more efficient
The specific requirements will be move to the https://github.com/RnDAO/tc-discordBot repository.
Description
The bugs that are stopping us from moving the charts into production (mvp) come from incomplete extraction and processing of data from Discord. Overall, this repository doesn't follow best practices. The idea is to refactor this component, including tests and a set of requirements.
Tasks
GET /guilds/[{guild.id}/threads/active
) - what does it do with it?VIEW_CHANNEL
,READ_MESSAGE_HISTORY
) - do we persist this?GET /channels/[{channel.id}/messages
)reactions
from messageuser_mentions
from messagerole_mentions
from messagereplied_user
from messagethread
andthreadId
from messageNext
Documentation