j0k3r / graby

Graby helps you extract article content from web pages
MIT License
367 stars, 74 forks

lesswrong.com workaround #199

Closed Najrim closed 5 years ago

Najrim commented 5 years ago

See comment on commit. Based on issue #3767 in Wallabag repo.

coveralls commented 5 years ago

Coverage remained the same at 97.781% when pulling 28472042908f02e4079550a7f534be4b15eab8ae on Najrim:patch-1 into 6c1506f19d1bb876fb26bfc11e3e11a993eb7fe8 on j0k3r:master.

techexo commented 5 years ago

I am interested to see whether this PR will be accepted, because some other websites would require a similar workaround. I think @j0k3r didn't want to make such modifications in the core code of graby.

Maybe there is a way to include them not in the core source code, but in graby's configuration on wallabag (maybe here?). I think only @j0k3r will be able to tell us.

j0k3r commented 5 years ago

Yeah, that's a good question. I would like to work on a way to use a configuration to add or remove this kind of rewrite_url pattern. Or maybe this should go into the site config instead of being hard-coded in the code. A graby configuration option for that might be a good idea, but what if a website other than wallabag wants the same behavior for lesswrong? It would have to add it to its own graby configuration too.

What do you think @fivefilters?

fivefilters commented 5 years ago

Yes, I agree @j0k3r. I think it'd be better if it was handled in the site config file. I've been thinking about this too, as there are sites where adding a parameter saves an unnecessary redirect or avoids a GDPR/cookie wall being displayed. So perhaps in addition to rewrite_url, we could also have a way to add/remove query string parameters.
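
To make the add/remove idea concrete, here is a rough sketch of how per-site query string rules might behave. It is in Python purely for illustration (graby itself is PHP), and everything in it is an assumption: the `SITE_RULES` table, the `add` and `remove` keys, the `apply_query_rules` helper, and the `example.com` host are all hypothetical and are not existing site config directives or graby APIs.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical per-site rules. The "add" and "remove" keys are made up
# for this sketch; they are not real site config directives.
SITE_RULES = {
    "example.com": {
        "add": {"format": "print"},   # parameters to set before fetching
        "remove": ["utm_source"],     # parameters to strip before fetching
    },
}


def apply_query_rules(url, rules=SITE_RULES):
    """Return url with the host's add/remove query rules applied."""
    parts = urlsplit(url)
    rule = rules.get(parts.hostname or "")
    if rule is None:
        # No rule for this host: leave the URL untouched.
        return url

    removed = set(rule.get("remove", []))
    added = rule.get("add", {})

    # Keep existing parameters unless they are removed or overridden.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in removed and key not in added
    ]
    params.extend(added.items())

    return urlunsplit(parts._replace(query=urlencode(params)))


print(apply_query_rules("https://example.com/post?id=7&utm_source=feed"))
```

Keeping the rules keyed by hostname mirrors the way site config files are per-site, so any consumer of the library would pick up the same behavior without touching core code.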

Najrim commented 5 years ago

Alright, there seems to be consensus that this should be in a configuration file instead so I will close the pull request for now. My reason for proposing the change is just personal convenience. I'm not self-hosting at the moment so I can't fix this locally.