irfan-yusanif closed 5 years ago
Are you sure you are reporting to the right repo? You don't even include AngleSharp.Js in your configuration.
Otherwise, for the given page I see no problem. The page does not contain any element with the class fhlkh. What it does is that the JS on the page starts a redirect (quite an efficient mechanism, but whatever ...), which leads to a page that contains some items (and elements with the CSS class that you are looking for).
HTH!
Sorry for not including the AngleSharp.Js part in the question. Here is my code:
var context = BrowsingContext.New(Configuration.Default.WithJs());
var document = await context.OpenAsync(req => req.Content(pageLink));
var titleSelector = ".fhlkh";
var titlecells = document.QuerySelectorAll(titleSelector);
var titles = titlecells.Select(m => m.GetAttribute("href"));
On the given page, when I run
document.getElementsByClassName("fhlkh")[0].href
it returns the href link fine. But the code above does not return any href links. Can you please help?
Unfortunately I cannot. There are several reasons why your code may not work. The top two options are:
If you rely on this working, I recommend diving into the code, debugging the issue, and coming up with a reason why the particular page is not working. If it's a missing API, we could solve it in AngleSharp.Js; if it's a problem with Jint, a PR in their repo may be helpful.
HTH!
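As a possible starting point for that debugging, here is a minimal sketch, not a confirmed fix. It assumes that pageLink in the snippet above holds the page URL; if so, req.Content(pageLink) would parse the URL text itself as HTML rather than request the page. The sketch opens the page by address with a default loader (resource loading enabled so that external scripts can be fetched) and waits via WhenStable() before querying. The Program class, the printed diagnostics, and the availability of WhenStable() and LoaderOptions in your AngleSharp version are assumptions to check against the packages you have installed. Whether the JS-driven redirect actually completes under AngleSharp.Js and Jint is exactly what this is meant to help find out.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using AngleSharp;      // NuGet package: AngleSharp
using AngleSharp.Dom;  // IDocument and document extensions
using AngleSharp.Io;   // LoaderOptions
using AngleSharp.Js;   // NuGet package: AngleSharp.Js (provides WithJs)

public static class Program
{
    public static async Task Main()
    {
        // URL from the question; assumed to be the value of pageLink.
        var pageLink = "https://www.olx.com.pk/items";

        // A loader is needed to request the page and its external scripts;
        // WithJs() enables script execution through Jint.
        var config = Configuration.Default
            .WithDefaultLoader(new LoaderOptions { IsResourceLoadingEnabled = true })
            .WithJs();

        var context = BrowsingContext.New(config);

        // Open by address (not req.Content) so the page is actually fetched,
        // then wait for pending loading and script work (assumed extension).
        var document = await context.OpenAsync(pageLink).WhenStable();

        // Diagnostic: shows whether the redirect described above took place.
        Console.WriteLine($"Document URL: {document.Url}");

        var hrefs = document.QuerySelectorAll(".fhlkh")
            .Select(m => m.GetAttribute("href"))
            .ToList();

        Console.WriteLine($"Matched elements: {hrefs.Count}");
        foreach (var href in hrefs)
        {
            Console.WriteLine(href);
        }
    }
}
```

If the printed document URL is still the original address and the match count is zero, the redirect script most likely did not run to completion, which points toward a missing API in AngleSharp.Js or a Jint limitation, as mentioned above.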
Page I want to scrape: https://www.olx.com.pk/items
My code:
QuerySelectorAll() returns an empty list. Note: the page to be scraped doesn't have jQuery included.