Open HalaKuwatly opened 6 years ago
The page you sent works if you open all of the question tabs, but the mechanism for doing that can differ from website to website, so I don't really have a general solution for all sites like that. I was hoping to leave that as an exercise for anyone looking for a challenge.
Add a bit of code that runs in the puppeteer tab instance before scraping:
Array.from(document.querySelectorAll('*'))
  .filter(e => !['script', 'style', 'link', 'meta', 'embed', 'object'].includes(e.tagName.toLowerCase())
    && getComputedStyle(e).display === 'none')
  .forEach(e => e.style.display = 'initial');
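For anyone who wants to try this, here is a minimal sketch of the same idea wrapped in a function, so it can be tested outside a browser. The names `revealHiddenElements` and `SKIPPED_TAGS` are made up for illustration and are not part of FAQ-Scrapper. The root node and the style lookup are passed in as parameters, which is an assumption made here only to keep the function testable.

```javascript
// Hypothetical helper, not part of FAQ-Scrapper.
// Tags that are never meant to be rendered, so they are left alone.
const SKIPPED_TAGS = ['script', 'style', 'link', 'meta', 'embed', 'object'];

// Forces every element whose computed display is 'none' to be shown.
// root:        anything with querySelectorAll (e.g. `document`)
// getStyle:    function returning the computed style of an element
// skippedTags: lowercase tag names to ignore
// Returns the number of elements that were revealed.
function revealHiddenElements(root, getStyle, skippedTags) {
  let revealed = 0;
  for (const el of Array.from(root.querySelectorAll('*'))) {
    if (skippedTags.includes(el.tagName.toLowerCase())) continue;
    if (getStyle(el).display === 'none') {
      el.style.display = 'initial';
      revealed += 1;
    }
  }
  return revealed;
}

// Possible use inside a puppeteer script, before scraping
// (assumes `page` is an already-opened puppeteer Page):
//
//   await page.evaluate(
//     `(${revealHiddenElements})(document, el => getComputedStyle(el), ${JSON.stringify(SKIPPED_TAGS)})`
//   );
```

Note that `initial` resolves to `inline` for the `display` property, so the revealed answers become readable by the scraper but the page layout may look different from the original.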
--
Cheers,
Darren Chan, Full Stack Web/Applications Developer, VCU https://www.ts.vcu.edu/ 804-295-8945
Hey, and thanks for the nice work!
Some pages that have a structure like this one: https://www.sskm.de/de/home/onlinebanking/tipps-und-hilfe/fragen_und_antworten/faq-elektronisches-postfach.html?n=true do not work. Any idea why, or what I can change to make it work? Thanks