AngleSharp / AngleSharp.Js

:angel: Extends AngleSharp with a .NET-based JavaScript engine.
https://anglesharp.github.io
MIT License

QuerySelectorAll gives empty list #56

Closed irfan-yusanif closed 5 years ago

irfan-yusanif commented 5 years ago

Page I want to scrape: https://www.olx.com.pk/items

My code:

var config = AngleSharp.Configuration.Default.WithDefaultLoader();
var document = await BrowsingContext.New(config).OpenAsync(pageLink);

var titleSelector = ".fhlkh";
var titlecells = document.QuerySelectorAll(titleSelector); // no results, empty list
var titles = titlecells.Select(m => m.GetAttribute("href"));

QuerySelectorAll() returns an empty list. Note: the page to be scraped doesn't include jQuery.

FlorianRappl commented 5 years ago

Are you sure you are reporting to the right repo? You don't even include AngleSharp.Js in your configuration.

Otherwise for the given page I see no problem. The page does not contain any element with the class fhlkh. What it does is that the JS on the page starts a redirect (quite an efficient mechanism, but whatever ...), which leads to a page that contains some items (and elements with the CSS class that you are looking for).
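The point above can be sketched without touching the network: QuerySelectorAll only sees elements that exist in the markup that was actually parsed. The inline HTML below is hypothetical and merely stands in for the page before and after the redirect. Only the class name comes from the question; SelectorDemo, HrefsAsync, and the sample links are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using AngleSharp;

public static class SelectorDemo
{
    // Parses the given markup (no network access) and returns the href
    // attribute of every element matching the selector.
    public static async Task<List<string>> HrefsAsync(string html, string selector)
    {
        var context = BrowsingContext.New(Configuration.Default);
        var document = await context.OpenAsync(req => req.Content(html));
        return document.QuerySelectorAll(selector)
                       .Select(m => m.GetAttribute("href"))
                       .ToList();
    }

    public static async Task Main()
    {
        // Hypothetical stand-ins for the markup before and after the redirect.
        const string beforeRedirect = "<p>Redirecting...</p>";
        const string afterRedirect =
            "<a class=\"fhlkh\" href=\"/item/1\">First</a>" +
            "<a class=\"fhlkh\" href=\"/item/2\">Second</a>";

        var before = await HrefsAsync(beforeRedirect, ".fhlkh");
        var after = await HrefsAsync(afterRedirect, ".fhlkh");

        Console.WriteLine(before.Count); // 0: no element carries the class
        Console.WriteLine(after.Count);  // 2: both anchors match
    }
}
```

An empty result therefore does not by itself point to a selector bug. It may simply mean the parsed document is not the one the browser ends up showing.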

HTH!

irfan-yusanif commented 5 years ago

Sorry for not including the AngleSharp.Js part in the question. Here is my code:

var context = BrowsingContext.New(Configuration.Default.WithJs());
var document = await context.OpenAsync(req => req.Content(pageLink));
var titleSelector = ".fhlkh";
var titlecells = document.QuerySelectorAll(titleSelector);
var titles = titlecells.Select(m => m.GetAttribute("href"));

On the given page, when I run document.getElementsByClassName("fhlkh")[0].href, it returns the href link fine. But the above code does not return any href links. Can you please help?
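One thing worth checking in the snippet above, offered as an observation rather than a diagnosis confirmed in this thread: req.Content(...) supplies the document source itself, so passing a URL string there makes AngleSharp parse the URL text as markup instead of loading the address. A minimal sketch follows; ContentDemo and BodyTextAsync are hypothetical names, and only the core AngleSharp package is assumed.

```csharp
using System;
using System.Threading.Tasks;
using AngleSharp;

public static class ContentDemo
{
    // Opens a document whose source is the given string and returns the
    // text content of the resulting <body>.
    public static async Task<string> BodyTextAsync(string source)
    {
        var context = BrowsingContext.New(Configuration.Default);
        var document = await context.OpenAsync(req => req.Content(source));
        return document.Body.TextContent;
    }

    public static async Task Main()
    {
        // The URL is treated as plain document text, not as an address to fetch.
        var text = await BodyTextAsync("https://www.olx.com.pk/items");
        Console.WriteLine(text); // prints the URL string itself
    }
}
```

To load the address, the first snippet in this thread already shows the form that does so: a configuration with WithDefaultLoader() and OpenAsync(pageLink). Whether the page's scripts then run to completion under AngleSharp.Js is a separate question that this sketch does not address.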

FlorianRappl commented 5 years ago

Unfortunately I cannot. There are several reasons why your code may not work. The top two options are:

If you rely on this working, I recommend diving into the code, debugging the issue, and finding out why this particular page is not working. If it's a missing API, we could solve it in AngleSharp.Js; if it's a problem with Jint, a PR in their repo may be helpful.

HTH!