towfiqi / serpbear

Search Engine Position Rank Tracking App
https://docs.serpbear.com/
MIT License

Rankings not read correctly for some keywords, >100 returned #243

Closed · jasalt closed this 1 week ago

jasalt commented 2 months ago

While testing the application, I noticed that some keywords don't return their correct ranking but come back as >100 instead, even though the right-hand side panel inspector shows that the site appears earlier in the results.

Example screenshot: (image attached)

This seems to happen quite often, maybe 40% of the time. The domain name characters in the example are a bit messed up because of the ä in the URL, but the problem is the same with all domains I tested. I get the same result when using either Scraping Robot or Serply, so I'm thinking maybe there is a configuration issue or something.

I'm not very familiar with TypeScript, so I'm wondering what the correct way to debug this further would be.

jasalt commented 2 months ago

Trying to make some sense of this by running it both locally and online, to better determine whether it's just an issue with my setup.

In some cases there is an error in the log, but the UI shows the scrape returning fine with a >100 value:

[0] GET /api/keywords?domain=example.com
[0] [ERROR] Scraping Keyword :  example com domain keyword . Error:  undefined
[0] [ERROR_MESSAGE]:  TypeError: Cannot read properties of undefined (reading 'load')
[0]     at extractScrapedResult (/app/.next/server/chunks/941.js:313:63)
[0]     at scrapeKeywordFromGoogle (/app/.next/server/chunks/941.js:278:100)
[0]     at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
[0]     at async refreshAndUpdateKeyword (/app/.next/server/chunks/941.js:79:34)
[0]     at async refreshAndUpdateKeywords (/app/.next/server/chunks/941.js:59:37)
[0] [SUCCESS] Updating the Keyword:  example com domain keyword
[0] time taken: 18342.214424001053ms
jasalt commented 2 months ago

I added some debug prints before const $ = cheerio.load(content); in utils/scraper.ts, and it seems content does include HTML. I'm confused about the undefined error, but I'm not too familiar with how the TypeScript works here.
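
For reference, the prints were roughly along these lines (a hypothetical sketch, not the project's actual code). Worth noting that "Cannot read properties of undefined (reading 'load')" on that line would mean the cheerio binding itself is undefined, rather than content:

// Hypothetical debug sketch around the failing line in utils/scraper.ts.
// Logging both values separates the two possible causes: an empty `content`
// vs. an undefined `cheerio` import.
import cheerio from 'cheerio';

export function debugExtract(content: string) {
  console.log('typeof cheerio:', typeof cheerio);               // 'undefined' here would explain the error
  console.log('content length:', content ? content.length : 0); // content can still contain HTML
  const $ = cheerio.load(content);                               // throws if cheerio is undefined
  console.log('page title:', $('title').text());
}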

OlliZabal commented 3 weeks ago

I can confirm this issue running serpbear in a local docker environment.

towfiqi commented 2 weeks ago

@OlliZabal Please share the domain/URL and the keyword that's having this issue. Also mention the Country, device and the scraper you are using.

OlliZabal commented 2 weeks ago

@towfiqi In my case it's the same issue as mentioned in #237, which hopefully is solved now. You have merged the commit into main - many thanks.

chester-e commented 2 weeks ago

@towfiqi Good evening, and thank you for the latest update. I just installed 2.0.4, but unfortunately I also have a lot of keywords which do rank high but serpbear shows them as >100, and in the docker logs I see:

[0] START SCRAPE:
[0] GET /api/keywords?domain=
[0] [ERROR] Scraping Keyword :  f . Error:  undefined
[0] [ERROR_MESSAGE]:  TypeError: Cannot read properties of undefined (reading 'load')
[0]     at extractScrapedResult (/app/.next/server/chunks/941.js:312:63)
[0]     at scrapeKeywordFromGoogle (/app/.next/server/chunks/941.js:277:100)
[0]     at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
[0]     at async refreshAndUpdateKeyword (/app/.next/server/chunks/941.js:79:34)
[0]     at async refreshAndUpdateKeywords (/app/.next/server/chunks/941.js:59:37)
[0] [SUCCESS] Updating the Keyword:

I'm using Scraping Robot.

Thank you!

towfiqi commented 1 week ago

@chester-e I just added your domain and the keyword "fitness holzkirchen" (country: Germany, device: Desktop) to my serpbear instance and set the scraper to Scraping Robot. It fetched the position just fine. No error of any kind.

mrcloudbook commented 1 week ago

We are facing the same issue for all the domains we are using with serpbear.

towfiqi commented 1 week ago

@mrcloudbook What scraper are you using?

chester-e commented 1 week ago

@towfiqi I was playing around with it quite a lot: removed and re-added keywords, removed the domain and re-added it. Is there a way to clean up the database or something like that? Later today I will deploy a fresh docker instance and see if the error is gone. At least there is an error message, right? :-)

mrcloudbook commented 1 week ago

@towfiqi I am using Scraping Robot.

towfiqi commented 1 week ago

@mrcloudbook Can you please share your domain, and some of the keywords with their country and device settings? I will try to recreate the issue. And kindly make sure your ScrapingRobot API has monthly credit left.

mrcloudbook commented 1 week ago

@towfiqi Domain: https://mrcloudbook.com, keyword: mrcloudbook, country: India

(Screenshot attached: 2024-11-12, 12:40 PM)

towfiqi commented 1 week ago

@mrcloudbook Just added the domain and the keyword and scraped with Scraping Robot without any issues. How are you running the app? Can you please check the logs?

chester-e commented 1 week ago

@towfiqi Fresh docker container, same issue. Same Scraping Robot account, serpbear 2.0.4!

mrcloudbook commented 1 week ago

@towfiqi I am using Docker and the API from Scraping Robot.

Here are logs

[0] GET /api/keywords?domain=mrcloudbook.com
[0] [ERROR] Scraping Keyword :  mrcloudbook . Error:  undefined
[0] [ERROR_MESSAGE]:  TypeError: Cannot read properties of undefined (reading 'load')
[0]     at extractScrapedResult (/app/.next/server/chunks/941.js:312:63)
[0]     at scrapeKeywordFromGoogle (/app/.next/server/chunks/941.js:277:100)
[0]     at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
[0]     at async refreshAndUpdateKeyword (/app/.next/server/chunks/941.js:79:34)
[0]     at async refreshAndUpdateKeywords (/app/.next/server/chunks/941.js:59:37)
[0] [SUCCESS] Updating the Keyword:  mrcloudbook
[0] time taken: 12203.406651005149ms
[0] GET /api/keywords?domain=mrcloudbook.com
jasalt commented 1 week ago

Just verifying that the same issue remains after the update here.

Gave 2.0.4 a quick try, freshly installed with Docker Compose (on a Coolify-managed VPS). Using Scraping Robot; verified there are credits in the panel, and their demo query works.

Using:

Same error as above still; the result is >100, even though that site is first in the regular search results, at least over here.

Docker Compose file copied from the docs, just in case there's something wrong with it:

version: '3.8'
services:
  app:
    image: towfiqi/serpbear
    restart: unless-stopped
    ports:
      - '3000:3000'
    environment:
      - USER=admin
      - PASSWORD=XXXX
      - SECRET=XXXX
      - APIKEY=XXXX
      - 'NEXT_PUBLIC_APP_URL=https://example.com'
    volumes:
      - 'serpbear_appdata:/app/data'
networks:
  my-network:
    driver: bridge
volumes:
  serpbear_appdata: null
towfiqi commented 1 week ago

Could recreate the bug. It happened because on new installs cheerio's latest version was being installed, which has a different import method. Fixed in v2.0.5. Please pull the latest version.
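
For anyone curious, the difference is roughly the following (a sketch of the import change, not necessarily the exact patch that went into v2.0.5):

// Sketch of the cheerio import difference (assumption, not the exact serpbear patch).
// Older cheerio releases worked with a default import:
//
//   import cheerio from 'cheerio';
//   const $ = cheerio.load(html);
//
// On the newer cheerio that fresh installs pulled in, the default export can be
// undefined, so cheerio.load throws "Cannot read properties of undefined (reading 'load')".
// Importing the named export avoids relying on a default export at all:
import { load } from 'cheerio';

export function titlesFromSerp(html: string): string[] {
  const $ = load(html);                               // no default export needed
  return $('h3').map((_, el) => $(el).text()).get();  // the h3 selector is illustrative only
}

To pick up the fix on a compose setup like the one above, re-pulling the image should be enough, e.g. docker compose pull followed by docker compose up -d.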

mrcloudbook commented 1 week ago

Working fine from my side. Thanks @towfiqi!

chester-e commented 1 week ago

@towfiqi Looking good! Thanks, mate! Would be glad to buy you a beer at least :-)

towfiqi commented 1 week ago

Great to hear that :)