jshemas / openGraphScraper

Node.js scraper service for Open Graph Info and More!
MIT License
643 stars, 102 forks

Support title & description fields #207

Closed ajmas closed 4 months ago

ajmas commented 5 months ago

Is your feature request related to a problem? Please describe.

Interested in having support for title & description fields

Describe the solution you'd like

Since not all sites provide Open Graph data, it would be useful to have a fallback to the title and description fields, even if only via an option. When using openGraphScraper to build a link preview, we would want this information without having to make a second, separate request for it.

I appreciate this isn't pure OG data, but it is related 'metadata'.

Describe alternatives you've considered

Writing my own code to fetch and parse the web page, which feels like a duplication of effort.

Also tried this:

await openGraphScraper({
  url,
  onlyGetOpenGraphInfo: true,
  fetchOptions: {
    headers: { 'accept-language': lang }
  },
  customMetaTags: ['title', 'description']
});

but this results in an error:

 errorDetails: TypeError: Cannot read properties of undefined (reading 'toLowerCase')
    at /User/ajmas/project/MyProject/node_modules/open-graph-scraper/dist/lib/extract.js:29:66
    at Array.forEach (<anonymous>)
    at Element.<anonymous> (/User/ajmas/project/MyProject/node_modules/open-graph-scraper/dist/lib/extract.js:28:20)
    at LoadedCheerio.each (/User/ajmas/project/MyProject/node_modules/cheerio/lib/api/api/traversing.ts:581:24)
    at extractMetaTags (/User/ajmas/project/MyProject/node_modules/open-graph-scraper/dist/lib/extract.js:23:15)
    at setOptionsAndReturnOpenGraphResults (/User/ajmas/project/MyProject/node_modules/open-graph-scraper/dist/lib/openGraphScraper.js:38:48)
   ...


jshemas commented 5 months ago

If you have onlyGetOpenGraphInfo set to true, it will only fetch Open Graph info. If you set it to false (the default), it should fall back on other meta tags.

What is the URL of the site that caused that error? I can look into it.

ajmas commented 5 months ago

Hi,

I'll take a look a bit later. I was actually trying a number of sites.

I think it may have been: https://www.earthfrequencies.org

jshemas commented 4 months ago

const ogs = require('open-graph-scraper');
const options = {
  url: 'https://www.earthfrequencies.org',
};
ogs(options)
  .then((data) => {
    const { error, result, response, html } = data;
    console.log('result:', result); // This contains all of the Open Graph results
  });

Will output:

result: {
  ogTitle: 'Earth Frequencies',
  ogLocale: 'en',
  charset: 'UTF-8',
  requestUrl: 'https://www.earthfrequencies.org',
  success: true
}

https://www.earthfrequencies.org/ does not have an og:title tag, so ogTitle falls back to the title tag. The site does not have a description tag either.
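
For the link-preview use case from the original request, a result like the one above could be reduced to the few fields a preview needs. This is only a minimal sketch: `pickPreviewFields` is a hypothetical helper, not part of open-graph-scraper, and the field names `ogDescription`, `twitterTitle`, `twitterDescription` and `ogUrl` are assumed to follow the same naming as `ogTitle` in the output shown.

```javascript
// Hypothetical helper (not part of open-graph-scraper): derive link-preview
// fields from a result object, tolerating missing values.
function pickPreviewFields(result) {
  // Return the first non-empty string among the candidates, or null.
  const firstDefined = (...values) =>
    values.find((v) => typeof v === 'string' && v.trim() !== '') ?? null;

  return {
    // Field names other than ogTitle and requestUrl are assumptions.
    title: firstDefined(result.ogTitle, result.twitterTitle),
    description: firstDefined(result.ogDescription, result.twitterDescription),
    url: firstDefined(result.ogUrl, result.requestUrl),
  };
}

// The result printed above, used as sample input.
const sample = {
  ogTitle: 'Earth Frequencies',
  ogLocale: 'en',
  charset: 'UTF-8',
  requestUrl: 'https://www.earthfrequencies.org',
  success: true,
};

console.log(pickPreviewFields(sample));
```

With this sample, the description comes back as null because the site has no description tag, so the caller can decide what to show instead.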

jshemas commented 4 months ago

I have fixed the toLowerCase error in open-graph-scraper@6.3.4. Follow https://github.com/jshemas/openGraphScraper?tab=readme-ov-file#custom-meta-tag-example when using customMetaTags.
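
As far as I can tell from the linked README example, each customMetaTags entry is an object with `multiple`, `property` and `fieldName` keys rather than a plain string, which would presumably explain why `['title', 'description']` hit an undefined value. A minimal sketch of building entries in that shape follows. `toCustomMetaTags` is a hypothetical helper, the tag names are only illustrative, and the key names should be checked against the README for the installed version.

```javascript
// Hypothetical helper (not part of open-graph-scraper): turn plain meta tag
// names into object-form customMetaTags entries.
function toCustomMetaTags(names) {
  return names.map((name) => ({
    multiple: false, // assume the tag appears at most once per page
    property: name,  // the meta tag's name/property attribute
    fieldName: name, // key under which the value is reported
  }));
}

const options = {
  url: 'https://www.earthfrequencies.org',
  customMetaTags: toCustomMetaTags(['description', 'keywords']),
};

console.log(options.customMetaTags);

// With the package installed, the options would be passed to the scraper as
// in the earlier comment. Reading result.customMetaTags is an assumption
// based on the README:
// require('open-graph-scraper')(options)
//   .then(({ result }) => console.log(result.customMetaTags));
```

Note that `<title>` is an element rather than a meta tag, so it is not something a custom meta tag entry would be expected to pick up.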

jshemas commented 4 months ago

Closing this issue for now. Please reopen if you have other examples that should return OG data.