1RedOne / 1redone.github.io

FoxDeploy's GitHub.io Page
https://1redone.github.io/
MIT License

Extracting and monitoring web content with PowerShell #39

Open utterances-bot opened 3 years ago

utterances-bot commented 3 years ago

Extracting and monitoring web content with PowerShell

FoxDeploy.com, Stephen Owen's technical blog about PowerShell, Systems Administration, GUI Design and Programming.

https://www.foxdeploy.com/blog/extracting-and-monitoring-web-content-with-powershell.html

claudiutudose commented 3 years ago

Hey Stephen,

Great article! I found it useful, and it is similar to something I plan to do. I don't have experience in this area, so I wanted to ask: is there a way to write code like the PowerShell you created above that applies a CSS rule?

For example, I want to hide a section of a specific website. The CSS is "display: none;", applied to a section that has a specific class or id.

Thank you and I look forward to hearing from you!

1RedOne commented 3 years ago

Happy you liked the blog post! Can you help me understand the full requirement?

Do you control the web page? If you do, you should control which elements appear by altering the CSS on the site, or by using JavaScript to decide when to hide or show an element.

If you don't...tell me what the script would do.

claudiutudose commented 3 years ago

Hello Stephen,

I apologize for the delay!

No, I don't have control over it. It can be any website on the internet. My idea is this: I want to customize sections of different websites, the way stylebot https://chrome.google.com/webstore/detail/stylebot/oiaejidbmkiecgbjeifoejpgmdaleoha?hl=ro or adblock plugins do for Chrome. I want to write code similar to what you created in this article, so that whenever I start up the system (Windows), the code starts automatically and applies the CSS customization from PowerShell, without my having to install third-party plugins like stylebot.

Thank you!

1RedOne commented 3 years ago

You could do this by newing up a WebKit or IE object and then manipulating the DOM (document object model, the parsed view of the webpage's HTML), but it would be a big ask and not really a good use for PowerShell.

It would be much better, IMHO, to do this as a web browser extension.
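
To give a sense of why this is a big ask: without driving a browser, about all PowerShell can do is rewrite a downloaded copy of a page. The sketch below is only that, not the DOM approach described above. The function name `Add-HideRule` and the `#sidebar` selector are hypothetical, chosen for illustration.

```powershell
# Hypothetical helper: returns a copy of $Html with a CSS rule that hides $Selector.
function Add-HideRule {
    param(
        [Parameter(Mandatory)] [string] $Html,
        [Parameter(Mandatory)] [string] $Selector   # e.g. '#sidebar' or '.ad-banner'
    )
    $style = "<style>$Selector { display: none !important; }</style>"
    $index = $Html.IndexOf('</head>', [System.StringComparison]::OrdinalIgnoreCase)
    if ($index -ge 0) {
        # Insert just before </head> so the rule comes after the page's own styles.
        return $Html.Insert($index, $style)
    }
    # No </head> found: prepend the rule instead.
    return $style + $Html
}

# Offline demo on a made-up page.
$page = '<html><head><title>Demo</title></head><body><div id="sidebar">Ads</div></body></html>'
Add-HideRule -Html $page -Selector '#sidebar'
```

This changes only the saved copy, not what a browser shows when it visits the live site, which is why an extension is the better fit.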

claudiutudose commented 3 years ago

OK Stephen, thank you for your help! If you know of any references or tutorials that use the method you described above to create such code, that would be great! Thanks!

Coldmiser commented 2 years ago

With the introduction of PS7, the Invoke-WebRequest cmdlet no longer produces the ParsedHtml property (which is a shame, because I have to parse a webpage exactly like you demonstrate).

Is there a way to redo this using PowerShell 7?

For example, I'm looking to find the last modified date of https://www.virtualbox.org/ticket/20536.
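
One possible workaround, sketched under assumptions: in PowerShell 7 the response still exposes the raw HTML in its `Content` property, so the text can be pulled out with a regular expression. The function name `Get-TextByClass` is hypothetical, and the class name to look for is whatever the browser's Inspect tool shows for the element you need. Regex matching is fragile on real-world HTML (it stops at the first closing tag of the same name, so nested elements of the same type are cut short), so treat this as a starting point.

```powershell
# Hypothetical helper: returns the text inside every element whose class list contains $ClassName.
function Get-TextByClass {
    param(
        [Parameter(Mandatory)] [string] $Html,
        [Parameter(Mandatory)] [string] $ClassName
    )
    $class = [regex]::Escape($ClassName)
    # Match <tag ... class="... name ...">inner</tag>; \k<tag> pairs the closing tag name.
    $pattern = '(?is)<(?<tag>[a-z][a-z0-9]*)\b[^>]*\sclass\s*=\s*["''][^"'']*(?<![\w-])' +
               $class +
               '(?![\w-])[^"'']*["''][^>]*>(?<inner>.*?)</\k<tag>\s*>'
    foreach ($m in [regex]::Matches($Html, $pattern)) {
        # Strip nested tags, decode entities such as &amp;, and collapse whitespace.
        $text = $m.Groups['inner'].Value -replace '<[^>]+>', ' '
        $text = [System.Net.WebUtility]::HtmlDecode($text)
        ($text -replace '\s+', ' ').Trim()
    }
}

# Offline demo on made-up HTML.
$sample = '<div class="changed">Last modified <b>2 years ago</b></div>'
Get-TextByClass -Html $sample -ClassName 'changed'

# Against a live page (the class name here is an assumption, check it with Inspect):
# $html = (Invoke-WebRequest -Uri 'https://www.virtualbox.org/ticket/20536').Content
# Get-TextByClass -Html $html -ClassName 'changed'
```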

rosamund commented 1 year ago

Hi Stephen, I read your post with interest and need your help with my work. I want to extract the data contained in the "impressum" page of my websites, for example the URL "https://kathrein.at/impressum". I right-clicked, chose Inspect, and found that the data I want (company name, address, telephone and email) is contained in div class="content-text".

I entered the following code but it doesn't work:

    Invoke-WebRequest -UseBasicParsing https://kathrein.at/impressum
    $rep.ParsedHtml.body.getElementsByClassName('content-text')| select -expand innertext

Can you help me correct this script and tell me how to get the same data for a list of URLs in a loop?

Thank you
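
Two things stand out in the snippet above: the response is never stored in `$rep`, and `-UseBasicParsing` skips the DOM parsing that fills in `ParsedHtml`. A minimal sketch of a loop, assuming Windows PowerShell 5.1 (where `ParsedHtml` is available) and a hypothetical function name `Get-ContentText`:

```powershell
# Windows PowerShell 5.1 only: ParsedHtml relies on the Internet Explorer engine.
# Get-ContentText is a hypothetical helper name used for illustration.
function Get-ContentText {
    param(
        [Parameter(ValueFromPipeline)] [string[]] $Url = @(),
        [string] $ClassName = 'content-text'
    )
    process {
        foreach ($u in $Url) {
            try {
                # No -UseBasicParsing here, and the response is kept in $rep.
                $rep = Invoke-WebRequest -Uri $u -ErrorAction Stop
                $text = $rep.ParsedHtml.body.getElementsByClassName($ClassName) |
                    Select-Object -ExpandProperty innerText
                # One output object per URL, so results can be exported later.
                [pscustomobject]@{ Url = $u; Text = ($text -join "`n") }
            }
            catch {
                Write-Warning "Could not read ${u}: $($_.Exception.Message)"
            }
        }
    }
}

$urls = @('https://kathrein.at/impressum')   # add the other sites here
# $urls | Get-ContentText | Export-Csv -Path .\impressum.csv -NoTypeInformation
```

Failures on one site are reported as warnings so the loop carries on with the remaining URLs.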