Using Playwright or Puppeteer does not work for me with extractFromHtml()
I tried several approaches, for example:
Getting the HTML to send to article-extractor:
const contentHTML = await page.locator('html').innerHTML();
OR
const htmlPageContent = await page.content();
For testing, in the place where I use Playwright, I tried:
const contentExtractHtmlArticle = await articleExtractor(htmlPageContent);
I created an extract.ts file with an exported function.
I tried this in my exported function:
const { content } = await extractFromHtml(html);
OR
const { content } = await extractFromHtml(String(html));
I imported my function where I use Playwright just to test whether I was getting an async/await error.
But if I extract from a URL in my exported function (I tried extract(url) in the section below), it works.
It does not work when sending HTML:
export async function articleExtractor(html) {
  try {
    const { content } = await extractFromHtml(html);
    // GC: isHTML is defined in utils.js. NOTE: it has nothing to do with the error; I removed this isHTML check while testing further.
    if (isHTML(content)) {
      console.log('HTML found');
      return content;
    } else {
      console.log('HTML NOT found');
      return 'text/html not found';
    }
  } catch (error) {
    console.error('Error extracting text from HTML:', error);
    return 'Error extracting text from HTML';
  }
}
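One thing that may be worth ruling out: if extractFromHtml resolves to null (no article found), destructuring `{ content }` throws and the catch block hides the real cause. Below is a minimal sketch of a null guard, not a confirmed fix. The names `ArticleLike`, `HtmlExtractor`, `pickContent`, and `extractContent` are hypothetical, and the extractor is passed in as a parameter, on the assumption that it accepts the HTML string plus an optional page URL.

```typescript
// Only the field used here; the real result is assumed to carry more fields.
interface ArticleLike {
  content?: string;
}

// Any function shaped like extractFromHtml(html, url) (assumed signature).
type HtmlExtractor = (html: string, url?: string) => Promise<ArticleLike | null>;

// Return the extracted content, or null when nothing usable was extracted.
function pickContent(article: ArticleLike | null): string | null {
  if (article === null || typeof article.content !== 'string') {
    return null;
  }
  return article.content.trim().length > 0 ? article.content : null;
}

// Hypothetical wrapper: checks for a null result before reading `content`.
async function extractContent(
  extractor: HtmlExtractor,
  html: string,
  url?: string,
): Promise<string | null> {
  const article = await extractor(html, url);
  return pickContent(article);
}
```

With Playwright this could be called as `await extractContent(extractFromHtml, await page.content(), page.url())`. A null return would then indicate that the extractor found no article in the HTML, rather than an async/await problem.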