[twitter] accounts getting suspended

ForxBase commented 3 months ago

two accounts got suspended in a matter of days and downloading a few user profile's media. is this problem solvable? anyone else with this problem?

NonaSuomi commented 3 months ago

Been running into the same thing myself- lost three accounts over the last week and change.

I'm planning to let it rest a week or so before trying to make any more accounts, on the suspicion that my IP had been flagged for enhanced scrutiny. I figure if may be a temporary thing and if I back off for a little I might be okay to try again with a more cautious set of values for sleep and sleep-request.

I also suspect that time of use may be a factor, so I was planning to only schedule my script to only the extractor during local daytime hours, in case the scraper running overnight was tripping some kind of suspicious-activity alert.

a84r7a3rga76fg commented 3 months ago

Post your configuration. Switch IP address. Don't use timeline (it's broken anyway) if you're using it.

NonaSuomi commented 3 months ago

Maybe you can elaborate on what you mean here about not using timeline and your rationale? I see you have your own open issue regarding infinite sleep-request behavior that mentions it, but that doesn't really seem relevant to the problem of accounts being banned by Twitter.

And maybe you could also prove some more helpful advice on how to actually get an ISP to issue a new IP? As far as I'm aware most of them use DHCP, so a release+renew at the gateway will just pull the same address, and I have no interest in leaving my modem disconnected for long enough that the lease expires.

At any rate, at command line, I'm just using gallery-dl https://x.com/<account> on first-pass, and gallery-dl -o skip=abort:3 https://x.com/<account> on subsequent runs. Config, in relevant part (removed other extractor details):

{
    "extractor": {
        "archive": "<database>,
        "base-directory": <directory>,
        "path-extended": true,
        "user-agent": "browser",
        "retries":-1,
        "twitter":{
            "archive": <twitter_db>,
            "cookies": <twitter_cookie_file>,
             "filename": "{date:%Y%m%d_%H%M%S}-{tweet_id}-img{num:>02}.{extension}",
             "sleep": [5, 7.5],
             "sleep-request": [30, 35]
        }
    },
    "downloader": {
        "mtime": false
    },
    "output": {
        "shorten": "eaw"
    }
}

a84r7a3rga76fg commented 3 months ago

timeline is retrieving tweets from random users. abort doesn't work because it ignores those tweets. The process never ends because I waited days for the process of the input URL to end. When timeline did work, it was what always caused Twitter to impose rate limiting regardless of what my sleep and sleep-request timings were. Twitter bans you if your IP address is on their list while getting rate limited.

ForxBase commented 3 months ago

timeline is retrieving tweets from random users. abort doesn't work because it ignores those tweets. The process never ends because I waited days for the process of the input URL to end. When timeline did work, it was what always caused Twitter to impose rate limiting regardless of what my sleep and sleep-request timings were. Twitter bans you if your IP address is on their list while getting rate limited.

I just made a new account over another IP and used that IP to download from one single user. My account got suspended almost immediately after the download finished! I don't know what to do.

ericsia commented 3 months ago

same here, my account got suspended, twitter must have been noticing this

ForxBase commented 3 months ago

same here, my account got suspended, twitter must have been noticing this

twitter is unusable now and I can't make a new account for each user download...

06000208 commented 3 months ago

I've had an account suspended as well. Not sure what my sleep values were, unfortunately, as I increased them since then.

A related issue: #5775

ForxBase commented 3 months ago

I've had an account suspended as well. Not sure what my sleep values were, unfortunately, as I increased them since then.

A related issue: #5775

Anyone who doesn't get suspended? Is there nothing I can do?

overattackwatch commented 3 months ago

I've had an account suspended as well. Not sure what my sleep values were, unfortunately, as I increased them since then. A related issue: #5775

Anyone who doesn't get suspended? Is there nothing I can do?

I just kept appealing but I think they are ignoring me now so Cheers musk

ForxBase commented 2 months ago

I've had an account suspended as well. Not sure what my sleep values were, unfortunately, as I increased them since then. A related issue: #5775

Anyone who doesn't get suspended? Is there nothing I can do?

I just kept appealing but I think they are ignoring me now so Cheers musk

does it work for you now?

WarmWelcome commented 2 months ago

Hoping that a fix for this comes out soon. Have not been able to back up in a while and artists from Brazil have started to delete their accounts, or have already. Any help would be appreciated. Account got locked but not banned. sleep at 10-38 and sleep-request at 15-55.

NonaSuomi commented 2 months ago

That can't be it, or at least not the whole picture, because the first account I used with GDL, and consequently the first one I lost, was my daily-driver personal account.

Twi-Hard commented 2 months ago

I haven't used my account in weeks now and it's still been downloading 24/7 successfully. Even when I "used" my account I was just browsing art or looking for accounts to download without tweeting or retweeting.

WarmWelcome commented 2 months ago

I haven't used my account in weeks now and it's still been downloading 24/7 successfully. Even when I "used" my account I was just browsing art or looking for accounts to download without tweeting or and retweeting.

What region/country are you located in? Do you have 2fa activated, with phone or something? Trying to figure out the link between all of these inconsistent recommendations.

Twi-Hard commented 2 months ago

I'm in the west coast USA. I don't have 2FA enabled. I have a phone number on the account but it's a Google voice number which means it's a VOIP number which I assume is the type of number the bot people use. I have a dynamic IP and have never used a VPN or proxy with this specific account. Before Elon Musks takeover I filled a 14TB drive with the drive speed being the bottleneck which means I was going extremely fast. I say that because that surely should have caused some red flags on their end. Unfortunately almost none of those accounts were relevant so I switched to whitelisting which accounts I download. I've never used Twitter to tweet, retweet or like ever. I've used it to DM people and browse art. The account is 5 years old. It's also a developer account but I doubt that matters.

I said the following in a related issue (https://github.com/mikf/gallery-dl/issues/5775)

Anyone who doesn't get suspended?

I download Twitter 24/7 with a low sleep setting. Before I made the time between each request random (using a range) I had to have a lot higher sleep between each request. This is using my home IP with my actual twitter account I use. After I lowered the sleep and made it a range I've recieved no rate limiting at all. Before the change I'd get told to wait until a certain time before continuing. This is a little confusing to me because I'm probably making a lot more requests per day than Elon allocates to free accounts. I'm not home so I can't tell you the settings yet. I actually copied them from somewhere else in gallery-dl's issues.
-o "sleep=[1.5,5]"
-o "sleep-request=[6.0,12.0]"

WarmWelcome commented 2 months ago

I'm in the west coast USA. I don't have 2FA enabled. I have a phone number on the account but it's a Google voice number which means it's a VOIP number which I assume is the type of number the bot people use. I have a dynamic IP and have never used a VPN or proxy with this specific account. Before Elon Musks takeover I filled a 14TB drive with the drive speed being the bottleneck which means I was going extremely fast. I say that because that surely should have caused some red flags on their end. Unfortunately almost none of those accounts were relevant so I switched to whitelisting which accounts I download. I've never used Twitter to tweet, retweet or like ever. I've used it to DM people and browse art. The account is 5 years old. It's also a developer account but I doubt that matters.

I said the following in a related issue (#5775)
Anyone who doesn't get suspended?

I download Twitter 24/7 with a low sleep setting. Before I made the time between each request random (using a range) I had to have a lot higher sleep between each request. This is using my home IP with my actual twitter account I use. After I lowered the sleep and made it a range I've recieved no rate limiting at all. Before the change I'd get told to wait until a certain time before continuing. This is a little confusing to me because I'm probably making a lot more requests per day than Elon allocates to free accounts. I'm not home so I can't tell you the settings yet. I actually copied them from somewhere else in gallery-dl's issues.
-o "sleep=[1.5,5]"
-o "sleep-request=[6.0,12.0]"

Im guessing that that is as a result of having a legacy twitter developer account... ugh. Might have to give applying for one a shot, but they watch what you do and I dont even know what I would say in the 250 character application.

WarmWelcome commented 2 months ago

I applied for the API thing and immediately got access to the developer section, so apparently it isn't a "wait for approval" type application. Going to give it a try sometime later, but I still have hard API limits.

ericsia commented 2 months ago

I suspect that twitter does not delete accounts if it seems like a human user uses it regularly. What I suspect is going on is that gallery-dl activity is getting flagged as a possible bot activity and needs further investigation. Then if there is no evidence of typical user activity, such as tweeting, retweeting, etc. it deletes account. However if there is evidence of typical user activity, it makes the user do the arkrose challenge. At least, that is what I noticed with my gallery-dl use on my accounts.

I might be wrong, and I do not know for sure. Personally, I have not lost an account yet, but the accounts that I use gallery-dl on are also accounts that I use weekly. Though these days, I always get a arkrose challenge after a gallery-dl run. I also do not know if there are any actions or activity that would get a account deleted that a gallery-dl user might also be doing. So what I am saying could be a detrimental task. Try it on your account at your own risk.

I think your account will get banned soon as well. Afterall, we all do the arkose challenge before getting banned. Now my new account can't even login using cookie option

Joebugg commented 2 months ago

Wow, they really really want you to use a browser to scrape with? Seems kind of backwards.

So if you use Selenium, what happens? I'm wondering if they're using a JS version of a canary warrant. There's code you can run on the server that expects certain responses (i.e. anti-adblocker methods), for example. Not sure the "order of headers" is too useful now, since browsers all seem to be moving to randomizing the order. Do people get these same bans if they use a userscript method?

I'm actually curious what they're doing that triggers the bans. Of course, they don't want you to know. ;) Watch it be something stupidly simple because they only have to get it right, once.

docholllidae commented 2 months ago

i have been scraping from twitter for near a year, and only lost two accounts for reasons unrelated to g-dl e: also both cookies come from accounts which i actively use, the cookie i typically dl with comes from my more heavily used account

my config:

{
    "extractor": {
        "base-directory": "X:/My Drive/!pr0n/",
        "archive": "%appdata%/gallery-dl/archive.sqlite3",
        "path-restrict": "^A-Za-z0-9_.~!-",
        "skip": "abort:3",
        "keywords-default": "",

        "twitter": {
            "archive": "X:/My Drive/zzTwitter/archive.twitter.sqlite3",
            "parent-directory": "true",
            "skip": "abort:3",

            "#cookies": "X:/My Drive/zzTwitter/cookies.twitter.1.txt",
            "cookies": "X:/My Drive/zzTwitter/cookies.twitter.2.txt",

            "sleep": [24.9, 45.2],
            "sleep-request": [23.8, 52.6],

            "image-filter": "author is user",
            "logout": true,
            "syndication": true,
            "text-tweets": true,
            "include": ["avatar","background","media","timeline"],

            "directory": {
                "count ==0":["zzTwitter","downloads","{author[id]}.{author[name]}","text_tweets"],
                "":         ["zzTwitter","downloads","{author[id]}.{author[name]}","media"]
            },
            "filename": "{date:%Y-%m-%d_%H-%M-%S}~_~{tweet_id}-{num}.{author[name]}_~{content[0:69]}~_~{filename}.{extension}",
            "avatar": {
                "directory": ["zzTwitter","downloads","{author[id]}.{author[name]}","media","avatar"],
                "archive": "",
                "filename": "{date:%Y-%m-%d_%H-%M-%S}_avatar_{author[id]}.{author[name]}~_~{filename}.{extension}"
            },
            "background": {
                "directory": ["zzTwitter","downloads","{author[id]}.{author[name]}","media","background"],
                "archive": "",
                "filename": "background_{date:%Y-%m-%d_%H-%M-%S}~_~{filename}.{extension}"
            },

            "metadata": true,
            "postprocessors":[{
                "name": "metadata",
                "event": "post",
                "directory": "metadata",
                "filename": "{date:%Y-%m-%d_%H-%M-%S}~_~{tweet_id}.{author[name]}~_~{content[0:69]}.json"
            }]            
        }
    }
}

ForxBase commented 2 months ago

i have been scraping from twitter for near a year, and only lost two accounts for reasons unrelated to g-dl e: also both cookies come from accounts which i actively use, the cookie i typically dl with comes from my more heavily used account

my config:

{
    "extractor": {
        "base-directory": "X:/My Drive/!pr0n/",
        "archive": "%appdata%/gallery-dl/archive.sqlite3",
        "path-restrict": "^A-Za-z0-9_.~!-",
        "skip": "abort:3",
        "keywords-default": "",

        "twitter": {
            "archive": "X:/My Drive/zzTwitter/archive.twitter.sqlite3",
            "parent-directory": "true",
            "skip": "abort:3",

            "#cookies": "X:/My Drive/zzTwitter/cookies.twitter.1.txt",
            "cookies": "X:/My Drive/zzTwitter/cookies.twitter.2.txt",

            "sleep": [24.9, 45.2],
            "sleep-request": [23.8, 52.6],

            "image-filter": "author is user",
            "logout": true,
            "syndication": true,
            "text-tweets": true,
            "include": ["avatar","background","media","timeline"],

            "directory": {
                "count ==0":["zzTwitter","downloads","{author[id]}.{author[name]}","text_tweets"],
                "":         ["zzTwitter","downloads","{author[id]}.{author[name]}","media"]
            },
            "filename": "{date:%Y-%m-%d_%H-%M-%S}~_~{tweet_id}-{num}.{author[name]}_~{content[0:69]}~_~{filename}.{extension}",
            "avatar": {
                "directory": ["zzTwitter","downloads","{author[id]}.{author[name]}","media","avatar"],
                "archive": "",
                "filename": "{date:%Y-%m-%d_%H-%M-%S}_avatar_{author[id]}.{author[name]}~_~{filename}.{extension}"
            },
            "background": {
                "directory": ["zzTwitter","downloads","{author[id]}.{author[name]}","media","background"],
                "archive": "",
                "filename": "background_{date:%Y-%m-%d_%H-%M-%S}~_~{filename}.{extension}"
            },

            "metadata": true,
            "postprocessors":[{
                "name": "metadata",
                "event": "post",
                "directory": "metadata",
                "filename": "{date:%Y-%m-%d_%H-%M-%S}~_~{tweet_id}.{author[name]}~_~{content[0:69]}.json"
            }]            
        }
    }
}

I tried this from your config but didn't work it doesn't log in "authentication required"

{ "extractor": { "base-directory": "D:\gallery-dl", "path-restrict": "^A-Za-z0-9_.~!-", "skip": "abort:3", "keywords-default": "",

    "twitter": {
        "parent-directory": "true",
        "skip": "abort:3",

        "cookies-from-browser": "firefox",

        "sleep": [24.9, 45.2],
        "sleep-request": [23.8, 52.6],

        "image-filter": "author is user",
        "logout": true,
        "syndication": true,
        "text-tweets": true,
        "include": ["avatar","background","media","timeline"],

        "directory": {
            "count ==0":["Twitter","downloads","{author[id]}.{author[name]}","text_tweets"],
            "":         ["Twitter","downloads","{author[id]}.{author[name]}","media"]
        },
        "filename": "{date:%Y-%m-%d_%H-%M-%S}~_~{tweet_id}-{num}.{author[name]}_~{content[0:69]}~_~{filename}.{extension}",
        "avatar": {
            "directory": ["Twitter","downloads","{author[id]}.{author[name]}","media","avatar"],
            "filename": "{date:%Y-%m-%d_%H-%M-%S}_avatar_{author[id]}.{author[name]}~_~{filename}.{extension}"
        },
        "background": {
            "directory": ["Twitter","downloads","{author[id]}.{author[name]}","media","background"],
            "filename": "background_{date:%Y-%m-%d_%H-%M-%S}~_~{filename}.{extension}"
        },

        "metadata": true,
        "postprocessors": [{
            "name": "metadata",
            "event": "post",
            "directory": "metadata",
            "filename": "{date:%Y-%m-%d_%H-%M-%S}~_~{tweet_id}.{author[name]}~_~{content[0:69]}.json"
        }]
    }
}

}

ForxBase commented 2 months ago

i have been scraping from twitter for near a year, and only lost two accounts for reasons unrelated to g-dl e: also both cookies come from accounts which i actively use, the cookie i typically dl with comes from my more heavily used account

my config:

{
    "extractor": {
        "base-directory": "X:/My Drive/!pr0n/",
        "archive": "%appdata%/gallery-dl/archive.sqlite3",
        "path-restrict": "^A-Za-z0-9_.~!-",
        "skip": "abort:3",
        "keywords-default": "",

        "twitter": {
            "archive": "X:/My Drive/zzTwitter/archive.twitter.sqlite3",
            "parent-directory": "true",
            "skip": "abort:3",

            "#cookies": "X:/My Drive/zzTwitter/cookies.twitter.1.txt",
            "cookies": "X:/My Drive/zzTwitter/cookies.twitter.2.txt",

            "sleep": [24.9, 45.2],
            "sleep-request": [23.8, 52.6],

            "image-filter": "author is user",
            "logout": true,
            "syndication": true,
            "text-tweets": true,
            "include": ["avatar","background","media","timeline"],

            "directory": {
                "count ==0":["zzTwitter","downloads","{author[id]}.{author[name]}","text_tweets"],
                "":         ["zzTwitter","downloads","{author[id]}.{author[name]}","media"]
            },
            "filename": "{date:%Y-%m-%d_%H-%M-%S}~_~{tweet_id}-{num}.{author[name]}_~{content[0:69]}~_~{filename}.{extension}",
            "avatar": {
                "directory": ["zzTwitter","downloads","{author[id]}.{author[name]}","media","avatar"],
                "archive": "",
                "filename": "{date:%Y-%m-%d_%H-%M-%S}_avatar_{author[id]}.{author[name]}~_~{filename}.{extension}"
            },
            "background": {
                "directory": ["zzTwitter","downloads","{author[id]}.{author[name]}","media","background"],
                "archive": "",
                "filename": "background_{date:%Y-%m-%d_%H-%M-%S}~_~{filename}.{extension}"
            },

            "metadata": true,
            "postprocessors":[{
                "name": "metadata",
                "event": "post",
                "directory": "metadata",
                "filename": "{date:%Y-%m-%d_%H-%M-%S}~_~{tweet_id}.{author[name]}~_~{content[0:69]}.json"
            }]            
        }
    }
}

but with this value you use

"sleep": [24.9, 45.2], "sleep-request": [23.8, 52.6],

it takes forever to download one entire user profile!

docholllidae commented 2 months ago

I tried this from your config but didn't work it doesn't log in "authentication required"

{ "extractor": { "base-directory": "D:\gallery-dl", "path-restrict": "^A-Za-z0-9_.~!-", "skip": "abort:3", "keywords-default": "",

    "twitter": {
        "parent-directory": "true",
        "skip": "abort:3",

        "cookies-from-browser": "firefox",

        "sleep": [24.9, 45.2],
        "sleep-request": [23.8, 52.6],

        "image-filter": "author is user",
        "logout": true,
        "syndication": true,
        "text-tweets": true,
        "include": ["avatar","background","media","timeline"],

        "directory": {
            "count ==0":["Twitter","downloads","{author[id]}.{author[name]}","text_tweets"],
            "":         ["Twitter","downloads","{author[id]}.{author[name]}","media"]
        },
        "filename": "{date:%Y-%m-%d_%H-%M-%S}~_~{tweet_id}-{num}.{author[name]}_~{content[0:69]}~_~{filename}.{extension}",
        "avatar": {
            "directory": ["Twitter","downloads","{author[id]}.{author[name]}","media","avatar"],
            "filename": "{date:%Y-%m-%d_%H-%M-%S}_avatar_{author[id]}.{author[name]}~_~{filename}.{extension}"
        },
        "background": {
            "directory": ["Twitter","downloads","{author[id]}.{author[name]}","media","background"],
            "filename": "background_{date:%Y-%m-%d_%H-%M-%S}~_~{filename}.{extension}"
        },

        "metadata": true,
        "postprocessors": [{
            "name": "metadata",
            "event": "post",
            "directory": "metadata",
            "filename": "{date:%Y-%m-%d_%H-%M-%S}~_~{tweet_id}.{author[name]}~_~{content[0:69]}.json"
        }]
    }
}

}

are you using a cookies file? twitter requires logging in to view most profiles nowadays create a cookie file and point to that.

I use an chrome extension Open Cookies.txt, install that and then log into twitter on your desktop browser. click the extension and if it requests permission to read your data on Twitter say Grant Access then choose the Raw Cookies.txt option and highlight everything in the resulting text block and copy-paste it into a file

but with this value you use

"sleep": [24.9, 45.2], "sleep-request": [23.8, 52.6],

it takes forever to download one entire user profile!

yes it can take some time; i haven't played much with the sleep times but they can probably go lower without risk to the account being banned it takes me about 3 days to re-download almost 500 profiles with the "skip": "abort:3" option set (about 10min per profile), after not running it for a while so there was more media than normal for me to grab. run my same inputs again with lower sleep/sleep-request values

docholllidae commented 2 months ago

also, my config for twitter will download text tweets too, and makes a json file for all tweets. if you don't want that you can easily edit them out of the config

overattackwatch commented 2 months ago

also, my config for twitter will download text tweets too, and makes a json file for all tweets. if you don't want that you can easily edit them out of the config

I havent really used cookies before, am I doing this right? my browser is operagx, I can switch if needed I have other browsers installed

code pasted into .conf file with twittercookie.txt being a copy paste of what I got from rawcookies.txt


        },

           "twitter": {
        "parent-directory": "true",
        "skip": "abort:3",

        "cookies": "C:\Users\UserProfile\AppData\Roaming\gallery-dl\twittercookie.txt",

        "sleep": [24.9, 45.2],
        "sleep-request": [23.8, 52.6],

        "image-filter": "author is user",
        "logout": true,
        "syndication": true,
        "text-tweets": false,
        "include": ["avatar","background","media","timeline"],

        "directory": {
            "count ==0":["Twitter","downloads","{author[id]}.{author[name]}","text_tweets"],
            "":         ["Twitter","downloads","{author[id]}.{author[name]}","media"]
        },
        "filename": "{date:%Y-%m-%d_%H-%M-%S}~_~{tweet_id}-{num}.{author[name]}_~{content[0:69]}~_~{filename}.{extension}",
        "avatar": {
            "directory": ["Twitter","downloads","{author[id]}.{author[name]}","media","avatar"],
            "filename": "{date:%Y-%m-%d_%H-%M-%S}_avatar_{author[id]}.{author[name]}~_~{filename}.{extension}"
        },
        "background": {
            "directory": ["Twitter","downloads","{author[id]}.{author[name]}","media","background"],
            "filename": "background_{date:%Y-%m-%d_%H-%M-%S}~_~{filename}.{extension}"
        },

        "metadata": true,
        "postprocessors": [{
            "name": "metadata",
            "event": "post",
            "directory": "metadata",
            "filename": "{date:%Y-%m-%d_%H-%M-%S}~_~{tweet_id}.{author[name]}~_~{content[0:69]}.json"
        }]
    }
}

I then do a basic download such as py -3 -m gallery_dl -D C:\Users\downloadtempname\ https://x.com/tempname

And I get a response of [twitter][info] Requesting guest token [twitter][error] AuthorizationError: Login required

docholllidae commented 2 months ago

can you run it again but add -v at the end of the command then paste the verbose output

overattackwatch commented 2 months ago

can you run it again but add -v at the end of the command then paste the verbose output

C:\Users\>py -3 -m gallery_dl -D C:\Users\downloadtempname\ https://x.com/tempname -v
[gallery-dl][debug] Version 1.27.1
[gallery-dl][debug] Python 3.12.2 - Windows-11-10.0.22631-SP0
[gallery-dl][debug] requests 2.32.3 - urllib3 2.2.0
[gallery-dl][debug] Configuration Files []
[gallery-dl][debug] Starting DownloadJob for 'https://x.com/tempname'
[twitter][debug] Using TwitterUserExtractor for 'https://x.com/tempname'
[twitter][debug] Using TwitterTimelineExtractor for 'https://x.com/tempname/timeline'
[twitter][info] Requesting guest token
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): api.x.com:443
[urllib3.connectionpool][debug] https://api.x.com:443 "POST /1.1/guest/activate.json HTTP/1.1" 200 63
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): x.com:443
[urllib3.connectionpool][debug] https://x.com:443 "GET /i/api/graphql/k5XapwcSikNsEsILW5FvgA/UserByScreenName?variables=**Listed information like what is set to true or false in the conf** HTTP/1.1" 200 1040
[urllib3.connectionpool][debug] https://x.com:443 "GET /i/api/graphql/tO4LMUYAZbR4T0SqQ85aAw/UserMedia?variables=%7 **Listed information like what is set to true or false in the conf** HTTP/1.1" 404 0
[twitter][debug] API error: 'Unspecified'
[twitter][error] AuthorizationError: Login required

mikf commented 2 months ago

[gallery-dl][debug] Configuration Files []

You config file is not getting loaded. Make sure it is at one of the locations listed here or by gallery-dl --config-status.

overattackwatch commented 2 months ago

[gallery-dl][debug] Configuration Files []
You config file is not getting loaded. Make sure it is at one of the locations listed here or by gallery-dl --config-status.


C:\Users\Userprofile>gallery-dl --config-status
[config][error] JSONDecodeError when loading 'C:\Users\Userprofile\gallery-dl\config.json': Invalid \escape: line 70 column 23 (char 2234)
C:\Users\Userprofile\AppData\Roaming\gallery-dl\config.json : Not Present
C:\Users\Userprofile\gallery-dl\config.json                 : Invalid JSON
C:\Users\Userprofile\gallery-dl.conf                        : Not Present

C:\Users\Userprofile>py -3 gallery-dl --config-status
C:\Users\Userprofile\AppData\Local\Programs\Python\Python312\python.exe: can't find '__main__' module in 'C:\\Users\\Userprofile\\gallery-dl'

mikf commented 2 months ago

[config][error] JSONDecodeError when loading 'C:\Users\Userprofile\gallery-dl\config.json': Invalid \escape: line 70 column 23 (char 2234)
    "cookies": "C:\Users\JamesD\AppData\Roaming\gallery-dl\twittercookie.txt",

You can't use single backslashes for filesystem paths in a JSON file. You need to either double them \\ or replace them with forward slashes /.

"cookies": "C:\\Users\\JamesD\\AppData\\Roaming\\gallery-dl\\twittercookie.txt",
"cookies": "C:/Users/JamesD/AppData/Roaming/gallery-dl/twittercookie.txt",

overattackwatch commented 2 months ago

C:\Users\UserProfile>gallery-dl --config-status
[config][error] JSONDecodeError when loading 'C:\Users\UserProfile\gallery-dl\config.json': Expecting ',' delimiter: line 103 column 2 (char 3653)
C:\Users\UserProfile\AppData\Roaming\gallery-dl\config.json : Not Present
C:\Users\UserProfile\gallery-dl\config.json                 : Invalid JSON
C:\Users\UserProfile\gallery-dl.conf                        : Not Present

Hrxn commented 2 months ago

Your config file is still not valid JSON, therefore it is not loaded/used at all. Use a site like https://www.jslint.com/ for example to fix your JSON if your editor can't do that.

overattackwatch commented 2 months ago

Your config file is still not valid JSON, therefore it is not loaded/used at all. Use a site like https://www.jslint.com/ for example to fix your JSON if your editor can't do that.

Thanks for the site, I had the json ending with

            }]            
}

fixed when I changed it to

ForxBase commented 2 months ago

I tried this from your config but didn't work it doesn't log in "authentication required" { "extractor": { "base-directory": "D:\gallery-dl", "path-restrict": "^A-Za-z0-9_.~!-", "skip": "abort:3", "keywords-default": "",
    "twitter": {
        "parent-directory": "true",
        "skip": "abort:3",

        "cookies-from-browser": "firefox",

        "sleep": [24.9, 45.2],
        "sleep-request": [23.8, 52.6],

        "image-filter": "author is user",
        "logout": true,
        "syndication": true,
        "text-tweets": true,
        "include": ["avatar","background","media","timeline"],

        "directory": {
            "count ==0":["Twitter","downloads","{author[id]}.{author[name]}","text_tweets"],
            "":         ["Twitter","downloads","{author[id]}.{author[name]}","media"]
        },
        "filename": "{date:%Y-%m-%d_%H-%M-%S}~_~{tweet_id}-{num}.{author[name]}_~{content[0:69]}~_~{filename}.{extension}",
        "avatar": {
            "directory": ["Twitter","downloads","{author[id]}.{author[name]}","media","avatar"],
            "filename": "{date:%Y-%m-%d_%H-%M-%S}_avatar_{author[id]}.{author[name]}~_~{filename}.{extension}"
        },
        "background": {
            "directory": ["Twitter","downloads","{author[id]}.{author[name]}","media","background"],
            "filename": "background_{date:%Y-%m-%d_%H-%M-%S}~_~{filename}.{extension}"
        },

        "metadata": true,
        "postprocessors": [{
            "name": "metadata",
            "event": "post",
            "directory": "metadata",
            "filename": "{date:%Y-%m-%d_%H-%M-%S}~_~{tweet_id}.{author[name]}~_~{content[0:69]}.json"
        }]
    }
}
}
are you using a cookies file? twitter requires logging in to view most profiles nowadays create a cookie file and point to that.

I use an chrome extension Open Cookies.txt, install that and then log into twitter on your desktop browser. click the extension and if it requests permission to read your data on Twitter say Grant Access then choose the Raw Cookies.txt option and highlight everything in the resulting text block and copy-paste it into a file

but with this value you use "sleep": [24.9, 45.2], "sleep-request": [23.8, 52.6], it takes forever to download one entire user profile!

yes it can take some time; i haven't played much with the sleep times but they can probably go lower without risk to the account being banned it takes me about 3 days to re-download almost 500 profiles with the "skip": "abort:3" option set (about 10min per profile), after not running it for a while so there was more media than normal for me to grab. run my same inputs again with lower sleep/sleep-request values

Thanks. With these values

"sleep": [24.9, 45.2], "sleep-request": [23.8, 52.6],

it takes you three days to download 500 profiles? How? I lowered them, didn't get banned yet.

docholllidae commented 2 months ago

Thanks. With these values

"sleep": [24.9, 45.2], "sleep-request": [23.8, 52.6],

it takes you three days to download 500 profiles? How? I lowered them, didn't get banned yet.

i updated to these values:

            "sleep": [12.9, 31.2],
            "sleep-request": [11.8, 35.6],

and got through 400 profiles in about 16hr i simply am being overly cautious of the twitter timeouts when i first started with sleep times of something like 5s i would frequently be forced to prove i'm human . i'm actually surprised it never resulted in a ban considering how often it happened

overattackwatch commented 2 months ago

and got through 400 profiles in about 16hr i simply am being overly cautious of the twitter timeouts when i first started with sleep times of something like 5s i would frequently be forced to prove i'm human . i'm actually surprised it never resulted in a ban considering how often it happened

Same used to get prove your human alot now never, Its weird. the account I use to download is even suspended and it just doesnt care

WarmWelcome commented 1 month ago

i have been scraping from twitter for near a year, and only lost two accounts for reasons unrelated to g-dl e: also both cookies come from accounts which i actively use, the cookie i typically dl with comes from my more heavily used account

my config:

{
    "extractor": {
        "base-directory": "X:/My Drive/!pr0n/",
        "archive": "%appdata%/gallery-dl/archive.sqlite3",
        "path-restrict": "^A-Za-z0-9_.~!-",
        "skip": "abort:3",
        "keywords-default": "",

        "twitter": {
            "archive": "X:/My Drive/zzTwitter/archive.twitter.sqlite3",
            "parent-directory": "true",
            "skip": "abort:3",

            "#cookies": "X:/My Drive/zzTwitter/cookies.twitter.1.txt",
            "cookies": "X:/My Drive/zzTwitter/cookies.twitter.2.txt",

            "sleep": [24.9, 45.2],
            "sleep-request": [23.8, 52.6],

            "image-filter": "author is user",
            "logout": true,
            "syndication": true,
            "text-tweets": true,
            "include": ["avatar","background","media","timeline"],

            "directory": {
                "count ==0":["zzTwitter","downloads","{author[id]}.{author[name]}","text_tweets"],
                "":         ["zzTwitter","downloads","{author[id]}.{author[name]}","media"]
            },
            "filename": "{date:%Y-%m-%d_%H-%M-%S}~_~{tweet_id}-{num}.{author[name]}_~{content[0:69]}~_~{filename}.{extension}",
            "avatar": {
                "directory": ["zzTwitter","downloads","{author[id]}.{author[name]}","media","avatar"],
                "archive": "",
                "filename": "{date:%Y-%m-%d_%H-%M-%S}_avatar_{author[id]}.{author[name]}~_~{filename}.{extension}"
            },
            "background": {
                "directory": ["zzTwitter","downloads","{author[id]}.{author[name]}","media","background"],
                "archive": "",
                "filename": "background_{date:%Y-%m-%d_%H-%M-%S}~_~{filename}.{extension}"
            },

            "metadata": true,
            "postprocessors":[{
                "name": "metadata",
                "event": "post",
                "directory": "metadata",
                "filename": "{date:%Y-%m-%d_%H-%M-%S}~_~{tweet_id}.{author[name]}~_~{content[0:69]}.json"
            }]            
        }
    }
}

I stole a great deal of this and it works pretty damn well. One question though, what is syndication? Cant find it in the configuration docs here: https://gdl-org.github.io/docs/configuration.html

mikf commented 1 month ago

what is syndication?

It was a workaround to download age-restricted content without login, back when you could use Twitter as guest user. 1171911dc3c8c739f8eac1e16a42bfd53cce6ac7, 92ff99c8e55910ecb0c91d7cac67c76a336324dd

wankio commented 1 month ago

i'm using main twitter account, it dont even getting suspended for many years, but the one thing is rate limit lol. if you using clone account, etc.. it's highly getting suspended

mikf / gallery-dl

[twitter] accounts getting suspended #6020