j0k3r / graby

Graby helps you extract article content from web pages
MIT License

Site config with multiple dependent login_extra_fields #271

Open ghost opened 2 years ago

ghost commented 2 years ago

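I'm trying to authenticate with a website that requires two additional fields (_csrf and login_ticket, both of which change with each request). The following naïve solution ignores the dependence between the two values, because each line fetches the login page on its own and the extracted values come from different responses: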

login_extra_fields: login_ticket=@=xpath('//input[@name="login_ticket"]/@value', request_html('https://id.sueddeutsche.de/login'))
login_extra_fields: _csrf=@=xpath('//form[@id="login-form"]//input[@name="_csrf"]/@value', request_html('https://id.sueddeutsche.de/login'))

I couldn't find a configuration-based approach that either caches the downloaded page or sets both values at once. Did I miss something here? If not, what would be the best way to go about this, in your opinion?
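
To illustrate what I mean by "setting both values at once", here is a minimal Python sketch of the behaviour I'm after. It is not Graby code and does not use any Graby API: the class name, the function name, and the sample HTML are all made up for illustration. It also simplifies the XPath expressions above by matching inputs anywhere on the page instead of only inside the login form. The point is only that both fields are read from one and the same response.

```python
from html.parser import HTMLParser


class InputFieldCollector(HTMLParser):
    """Collects name -> value for every <input> element in a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name is not None:
            # Keep the first occurrence of each field name.
            self.fields.setdefault(name, attributes.get("value") or "")


def extract_login_fields(html, names=("_csrf", "login_ticket")):
    """Read all requested fields from a single copy of the login page.

    Hypothetical helper: raises KeyError if a requested field is missing.
    """
    collector = InputFieldCollector()
    collector.feed(html)
    collector.close()
    return {name: collector.fields[name] for name in names}


# Made-up stand-in for one downloaded copy of the login page.
SAMPLE_LOGIN_PAGE = """
<form id="login-form">
  <input type="hidden" name="_csrf" value="csrf-abc">
  <input type="hidden" name="login_ticket" value="ticket-123">
  <input type="text" name="login">
</form>
"""

# Both values come from the same response, so they belong together.
fields = extract_login_fields(SAMPLE_LOGIN_PAGE)
```

In the site config, the equivalent would be some way to download the login page once and reuse it for both login_extra_fields lines.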