1Paint / GroupMe-Chat-History

An application that retrieves and downloads the chat histories of GroupMe users.

Large Files #2

Open pksmoothie opened 8 years ago

pksmoothie commented 8 years ago

Hey -- this code works great. I'm trying to bind this GroupMe chat into a book as a wedding gift, and this is the closest I've gotten.

One issue I'm running into is that with large output files (at least over 58 MB) the code will stop pulling messages and just repeat whatever date it got back to.

For instance, with my group, it gets back to August 30, 2013, then repeats that date and a couple of surrounding ones for what seems to be the remaining age of the group, so hypothetically back to fall of 2011. Then it picks up again on August 31, 2013 and continues normally until the current date.

I was using the executable file. Do you think it would work better within Python, or is this a shortcoming of GroupMe's API? I've tried smaller group chats and have gotten back to pre-2011.

Please let me know, and thanks again for this!

1Paint commented 8 years ago

Hey smoothie. I ran some tests and it seems that this happens because GroupMe's API says there are, say, 1000 messages in a chat but only 700 messages are stored. By the time the program reaches the 700th message, it'll grab the last 100 over and over until the message count reaches 1000.

This implies that GroupMe's API has a limit on the number of messages saved per chat. I'm going to try to contact GroupMe to see if this is the case.

The executable and Python scripts do the same thing so there's no need to switch over.

Also, 58 MB is huge! I'm thinking about dividing chat histories into sets so, for example, users can have 3 files of 20000 lines each instead of one large 60000 line file.
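As a rough illustration of the splitting idea above, here is a minimal Python sketch. It is not the application's code: the function name `split_into_chunks` and the default of 20000 lines per set are assumptions taken only from the example figures in this comment.

```python
from itertools import islice


def split_into_chunks(lines, chunk_size=20000):
    """Yield successive lists of at most chunk_size lines.

    `lines` can be any iterable of strings, such as an open file.
    The name and default size are illustrative assumptions.
    """
    iterator = iter(lines)
    while True:
        # Pull the next chunk_size lines without loading the whole input.
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk
```

Each yielded chunk could then be written to its own output file (for example a hypothetical `history_part1.html`, `history_part2.html`, and so on), so a 60000 line history becomes three files of 20000 lines.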

pksmoothie commented 8 years ago

Hey, that sounds exactly like what's happening, which is unfortunate. And yes, we've been a pretty active group for about four and a half years now.

Whether or not there is a solution to this, this is an unbelievable tool. Great work and thanks for your quick response.

1Paint commented 8 years ago

I actually just checked my code, and the program should stop running once it sees that there are no more messages to retrieve or if messages are repeating. So I'm not too sure what's happening.
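A minimal sketch of that stopping rule, for illustration only. It assumes a hypothetical callable `fetch_page(before_id)` that returns a list of message dicts (newest first, each with an `"id"` key); this is not the application's real function, and the real GroupMe API response may be shaped differently.

```python
def collect_messages(fetch_page, reported_count):
    """Page backwards through a chat until it is exhausted.

    fetch_page(before_id) is a hypothetical callable returning a list of
    message dicts, newest first, each with an "id" key.
    reported_count is the message total the API claims the chat has.
    """
    messages = []
    seen_ids = set()
    before_id = None
    while len(messages) < reported_count:
        page = fetch_page(before_id)
        # Keep only messages that have not been stored yet.
        fresh = [m for m in page if m["id"] not in seen_ids]
        if not fresh:
            # Empty page or a page of pure repeats: nothing older exists,
            # even if the reported count has not been reached.
            break
        messages.extend(fresh)
        seen_ids.update(m["id"] for m in fresh)
        before_id = fresh[-1]["id"]
    return messages
```

Tracking seen IDs rather than trusting the reported count means a chat that claims 1000 messages but only stores 700 ends after 700 instead of re-fetching the last page.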

Can you tell me the runtime for retrieving that 58 MB chat history?

Also, thanks for the positive feedback. I'll try to see what I can do.

pksmoothie commented 8 years ago

19 minutes, 7 seconds

1Paint commented 8 years ago

Hi smoothie. Sorry for the delay. I've been busy the last few weeks.

The error you had was due to encountering an HTTP Error and the application not terminating. It could have been due to being rate-limited from retrieving so many files over a certain period of time.

I updated the application to terminate retrieval when an HTTP Error is raised. The application then writes down details of the error into the chat history HTML file. Also written down are details of the chat history and the latest message ID. With this info, the application can fix a selected broken file via the 'Repair' button.
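The behaviour described above could look roughly like the following Python sketch. It is an assumption-laden illustration, not the application's actual code: `fetch_page` is a hypothetical callable, the use of `urllib.error.HTTPError` is a guess at the error type, and the record keys simply mirror the fields reported later in this thread.

```python
from urllib.error import HTTPError


def retrieve_until_error(fetch_page, chat_type, chat_id):
    """Collect pages of messages, stopping cleanly on an HTTP error.

    fetch_page(latest_message_id) is a hypothetical callable that returns
    a list of message dicts (each with an "id" key) or raises HTTPError.
    Returns (messages, error_record); error_record is None on success.
    """
    messages = []
    latest_message_id = None
    error_record = None
    while True:
        try:
            page = fetch_page(latest_message_id)
        except HTTPError as err:
            # Stop retrieval and keep enough detail to resume later.
            error_record = {
                "ERROR": err.code,
                "msg": err.reason,
                "chat_type": chat_type,
                "chat_ID": chat_id,
                "latest_message_id": latest_message_id,
            }
            break
        if not page:
            break
        messages.extend(page)
        latest_message_id = page[-1]["id"]
    return messages, error_record
```

Saving the latest message ID alongside the error is what would let a repair step resume from that point instead of starting over.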

Note that the repair function will only work on chat histories retrieved using v1.1+ of the application. These chat histories will be compatible with a future update function that I will add.

Let me know if this fixes things.

pksmoothie commented 8 years ago

Hey, what's going on? I appreciate you still working on this. So, when I re-ran my chat, it did pop up with an error message. But, when I go to 'repair' the HTML file, I get the message, "Are you sure the chat history file is valid?"

1Paint commented 8 years ago

Alright, fixed it. It was an error only in the executable version.

As a reference:

Executables made using py2exe are unable to run linecache.getline() correctly.
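One possible workaround, sketched here as an assumption rather than the fix actually applied, is to read the wanted line with plain file iteration instead of `linecache`. The helper name `get_line` is hypothetical.

```python
from itertools import islice


def get_line(path, line_number, encoding="utf-8"):
    """Return line `line_number` (1-based) of a file, or "" if it is absent.

    Mirrors linecache.getline() in returning an empty string for a
    missing line, but only relies on ordinary file reading.
    """
    with open(path, encoding=encoding) as handle:
        # islice skips ahead to the requested line without reading the rest.
        return next(islice(handle, line_number - 1, line_number), "")
```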

1Paint commented 8 years ago

Last thing, what was the error code you encountered?

pksmoothie commented 8 years ago

ERROR: 500
msg: Internal Server Error
chat_type: group
chat_ID: 1673929
latest_message_id: 137781261980820327

I will try and re-run and let you know!

cdonohue413 commented 8 years ago

Hi there. How come every time I get to the point where I put in my access token, it starts to load and then, before it starts retrieving, it says "The Chat ID Entered is Invalid, or there any no messages in the Chat"? I know it's not empty, so what is the issue? Any ideas? Thanks!

1Paint commented 8 years ago

Hi cdonohue413. Please go here to discuss the issue.