icy / google-group-crawler

[Deprecated] Get (almost) original messages from google group archives. Your data is yours.
Started to fail when using cookie with "error trying to read config from the 'curl-options.txt' file" #38

Open Zedseayou opened 4 years ago

Zedseayou commented 4 years ago

Hi, thank you for writing this script!

I noticed that downloading from private groups appears to have broken for some reason. The included test files fail on _test_public_2_with_cookie() as well. The error is:

Warning: error trying read config from the 'curl-options.txt' file
:: Unable to find any mail messages from ./viettug.org-google-group-crawler-public2//mbox/

It seems like this error is being thrown by curl itself? I'll try to look into it some more, but I'm not the most familiar with these tools. I have curl 7.38.0 since I am on an old version of Debian, but it doesn't seem like either the user-agent or the header options have changed, and this was working a few weeks ago.

icy commented 4 years ago

@Zedseayou thanks for your feedback and using my script.

Warning: error trying read config from the 'curl-options.txt' file

Do you have anything in curl-options.txt? It's the cookie file used by curl, and you are supposed to generate it yourself (https://github.com/icy/google-group-crawler#private-group-or-group-hosted-by-an-organization).
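
A quick way to rule out a missing, empty, or incomplete file is to check it before running the crawler. The sketch below is only an illustration: the function name `check_curl_config` is hypothetical and not part of the script, and the check for `user-agent` and `header` entries assumes the file follows the layout described in the README section linked above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of google-group-crawler): sanity-check a
# curl config file before it is handed to curl.
check_curl_config() {
  local file="${1:-curl-options.txt}"

  if [ ! -f "$file" ]; then
    echo "missing: $file" >&2
    return 1
  fi
  if [ ! -r "$file" ]; then
    echo "not readable: $file" >&2
    return 1
  fi
  if [ ! -s "$file" ]; then
    echo "empty: $file" >&2
    return 1
  fi

  # Assumption: the file should define a user-agent and a (cookie) header,
  # written either as "user-agent = ..." or "--user-agent ...".
  if ! grep -Eiq '^[[:space:]]*-{0,2}user-agent' "$file"; then
    echo "no user-agent option in $file" >&2
    return 2
  fi
  if ! grep -Eiq '^[[:space:]]*-{0,2}header' "$file"; then
    echo "no header option in $file" >&2
    return 2
  fi

  echo "ok: $file"
}
```

Running `check_curl_config curl-options.txt` from the directory where the crawler is started should show whether curl is being pointed at a usable file. It does not verify that the cookie itself is still valid.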

icy commented 4 years ago

Please note that a cookie generated from your browser has quite a short TTL (time to live). You will likely have to regenerate that file after a few days (I don't know exactly how many).
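
Since the exact cookie lifetime is unknown, one rough safeguard is to warn when the options file has not been touched for a while. This is a minimal sketch, not something the script does: `is_stale` is a hypothetical helper, the 3-day default is an arbitrary guess, and the file's modification time is only a proxy for the cookie's real age. It relies on `find -maxdepth` and `-mmin`, which GNU and BSD `find` support.

```shell
#!/usr/bin/env bash
# Hypothetical helper: exit 0 if FILE was last modified more than MAX_DAYS
# days ago, exit 1 otherwise (including when FILE does not exist).
is_stale() {
  local file="$1" max_days="${2:-3}"   # 3 days is an arbitrary default
  # find prints the path only if its mtime is older than the threshold.
  [ -n "$(find "$file" -maxdepth 0 -mmin +"$((max_days * 1440))" 2>/dev/null)" ]
}

# Example: warn before a crawl if the cookie file looks old.
if is_stale curl-options.txt 3; then
  echo "curl-options.txt looks old; consider regenerating the cookie" >&2
fi
```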

icy commented 4 years ago

_test_public_2_with_cookie

The test is designed to work with the Travis CI system (https://github.com/icy/google-group-crawler/blob/master/.travis.yml). I'm sure the tests are broken now because the cookie data are out of date. Every time I need to run the tests, I have to update that data. In the source you will find that file encrypted (https://github.com/icy/google-group-crawler/blob/master/tests/curl-options.txt.enc), and you can't decrypt it from your laptop ;)

Zedseayou commented 4 years ago

I did have updated values in curl-options.txt when I ran this, but I haven't tried again recently. I will follow up and let you know.