Closed want-to-export-group closed 4 years ago
The test script works fine for "google-group-crawler-public" but fails for "google-group-crawler-public2" due to HTTP error 500. Could something be going wrong with the cookies?
@want-to-export-group Were you able to resolve the issue?
I haven't seen that issue. Maybe it's a temporary network issue. You can look at the wget command in the output and retry it to see if that helps.
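For anyone who wants to test the "temporary network issue" theory, here is a minimal sketch of a retry wrapper. The `retry` function, the attempt count, and the delay are all hypothetical and not part of the crawler script; it only re-runs whatever command you pass it until that command exits successfully.

```shell
# Hypothetical helper, not part of google-group-crawler:
# run a command up to <max_attempts> times, sleeping <delay> seconds between tries.
# Usage: retry <max_attempts> <delay_seconds> <command> [args...]
retry() {
  max_attempts="$1"
  delay="$2"
  shift 2
  attempt=1
  while :; do
    # Stop as soon as the command succeeds (exit status 0).
    "$@" && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "retry: giving up after $attempt attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Usage sketch (needs network access); the URL is the one reported in this thread:
# retry 3 5 wget -O /dev/null \
#   "https://groups.google.com/forum/?_escaped_fragment_=categories/google-group-crawler-public2"
```

wget exits with a non-zero status when the server answers with an error such as 500, so the wrapper keeps trying in that case. If every attempt fails the same way, the problem is probably not a transient network issue.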
Yes, I can confirm this issue. Google has changed something to prevent our script from working :(
No, I was not able to resolve it.
:( It used to work. Now accessing it from a web browser also generates an error: https://groups.google.com/forum/?_escaped_fragment_=categories/google-group-crawler-public2
By mistake, google-group-crawler-public2 was set to private mode. Now it's fine. Btw, I have rewritten the script using curl; hopefully it helps resolve a few strange issues. Stay tuned.
The problem should be fixed in the latest version, 2.0.0 (using curl). Please check whether it works better for you. Thanks.
I am trying to archive messages from a large private group. The script seems to run fine, until the "Fetching data" step. Here is the output (the name has been changed to "group"):
:: Downloading all topics (thread) pages...
:: Creating './group//threads/t.0' with 'categories/group'
:: Fetching data from 'https://groups.google.com/forum/?_escaped_fragment_=categories/group'...
--2019-12-20 13:16:16-- https://groups.google.com/forum/?_escaped_fragment_=categories/group
Resolving groups.google.com (groups.google.com)... 2607:f8b0:400d:c0f::8a, 172.217.197.102, 172.217.197.113, ...
Connecting to groups.google.com (groups.google.com)|2607:f8b0:400d:c0f::8a|:443... connected.
HTTP request sent, awaiting response... 413 Request Entity Too Large
2019-12-20 13:16:16 ERROR 413: Request Entity Too Large.
As you can see, there is an Error 413. What is causing this, and how can it be fixed?