KianKhadempour closed 1 year ago
If you want, I can squash the two Instagram Chat commits into one and the two Colab commits into one so that the commit history is less messy.
Unfortunately I cannot make the colab file fit perfectly because it doesn't work with Jupyter Notebooks.
Sorry for the delay. I am currently on vacation, but I can fix these issues once I am back.
I did not close this... Maybe because I deleted all the commits? Shoot. Sorry about that.
@joweich Would you rather a photo/video message return the link or the word "photo"/"video"? I ask because the wordcloud gets diluted with the name of the chat, since the URI looks something like:
messages/inbox/[chat name]_[chat id]/photos/[random number]_[random number]_n_[random number].jpg
Switching to a single word would let you look for "video" or "photo" in the wordcloud to see how many times a video/photo was sent, instead of seeing a bunch of numbers and the name of the chat. The same applies to shares. Should I change those to just output "share"?
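To make the proposal concrete, here is a minimal sketch of the single-word approach. The key names `content`, `photos`, `videos` and `share` are my assumption about the exported message layout, and `message_text` is a hypothetical helper, not a function from this repo.

```python
from typing import Optional

# Assumed keys of an exported Instagram message and the word to emit for each.
# These names are an assumption, not confirmed anywhere in this thread.
PLACEHOLDERS = {"photos": "photo", "videos": "video", "share": "share"}


def message_text(message: dict) -> Optional[str]:
    """Return the message text, or a one-word placeholder for media and shares."""
    if "content" in message:
        return message["content"]
    for key, word in PLACEHOLDERS.items():
        if key in message:
            # Emit the placeholder instead of the URI, so the chat name and
            # the random numbers in the path never reach the wordcloud.
            return word
    return None  # nothing usable; the caller can skip this message
```

With this, every photo contributes the single word "photo", so its size in the wordcloud reflects how often photos were sent.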
After a bit of testing, I have concluded that not using the URI is the right choice. Printing a warning to the console on each skip would fire far too often to be OK, though, so I am not going to add that.
To add on to this, one of the problems is that special characters (e.g. Ã) appear very often in the wordcloud. I worked around this by setting min_word_length to 2 in the main.py file that I am using, but I think that this should be the default. I tried hard-coding it, but it didn't work, so it would be great if someone could look into that.
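For reference, the effect I am after can be sketched as a plain pre-filter that runs before the text reaches the wordcloud. `drop_short_words` is a hypothetical helper written for illustration; it does not explain why hard-coding the default failed for me.

```python
def drop_short_words(text: str, min_word_length: int = 2) -> str:
    """Keep only whitespace-separated words with at least min_word_length characters."""
    return " ".join(word for word in text.split() if len(word) >= min_word_length)
```

With the default of 2, a lone "Ã" or "a" is dropped, while normal words pass through unchanged.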
I have fixed the special characters bug in #64, but I still think that the minimum word length should default to 2.
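For anyone hitting the same symptom: a common cause of stray characters like Ã is UTF-8 text that was decoded as Latin-1 somewhere along the way. Assuming that is what happens here, the repair can be sketched as below. This is not necessarily what #64 does, and `repair_mojibake` is a hypothetical name.

```python
def repair_mojibake(text: str) -> str:
    """Undo a UTF-8-read-as-Latin-1 mix-up; return the input unchanged if it does not apply."""
    try:
        # Recover the original bytes, then decode them as the UTF-8 they really were.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text was not mis-decoded in this way (or is already correct).
        return text
```

Text that is already correct falls into the `except` branch and is returned as is, unless it happens to be pure ASCII, in which case the round trip is a no-op.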
I am not 100% sure that I covered all the edge cases, but it should work. Also, I couldn't figure out a way to use an f-string for the "{} sent {}'s story." part, so if anyone understands why that wasn't working for me, I would appreciate it if you could fix it.
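For comparison, here is a minimal sketch of the two forms side by side. The function and parameter names are hypothetical. I can't tell from this alone why the f-string failed for me; one thing worth checking is the quoting, since the apostrophe in "'s" ends the literal early if the string is delimited with single quotes.

```python
def story_line_format(sender: str, owner: str) -> str:
    # The str.format version, as currently used.
    return "{} sent {}'s story.".format(sender, owner)


def story_line_fstring(sender: str, owner: str) -> str:
    # The f-string version; double quotes outside keep the apostrophe literal.
    return f"{sender} sent {owner}'s story."
```

Both functions produce the same string for the same arguments, so the f-string should be a drop-in replacement wherever the values are known at the point the string is built.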