This story exists to improve the web scraping functionality in the pipeline so that summaries generated by watsonx.ai can be more accurate.
Things to keep in mind:
Currently, the EventRegistry API does return a body item containing text from the article. However, this body is not always reliable, because content unrelated to the article, such as ads, may also be included.
A possible improvement might be to manually scrape the article via the URL supplied in the API response data.
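
As a rough starting point, the manual scraping idea could look something like the sketch below. It uses only the Python standard library, keeps text found inside <p> tags, and skips containers that usually hold navigation, ads, or scripts. The names ParagraphExtractor, extract_article_text, scrape_article, and the SKIP_TAGS list are hypothetical and not part of the existing pipeline, and the tag-based filter is an assumption about typical article markup, not a guarantee that all ads are removed.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen

# Assumption: containers that rarely hold article text. Tune per news source.
SKIP_TAGS = {"script", "style", "nav", "aside", "footer", "header", "form"}


class ParagraphExtractor(HTMLParser):
    """Collects text inside <p> tags, ignoring anything nested in SKIP_TAGS."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.skip_depth = 0        # > 0 while inside a skipped container
        self.in_paragraph = False
        self.current = []          # text fragments of the open paragraph
        self.paragraphs = []       # finished, whitespace-normalized paragraphs

    def _flush(self):
        # Close the open paragraph (if any) and store it when non-empty.
        text = " ".join("".join(self.current).split())
        if text:
            self.paragraphs.append(text)
        self.current = []
        self.in_paragraph = False

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag == "p" and self.skip_depth == 0:
            self._flush()          # tolerate <p> tags that were never closed
            self.in_paragraph = True

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag == "p" and self.in_paragraph:
            self._flush()

    def handle_data(self, data):
        if self.in_paragraph and self.skip_depth == 0:
            self.current.append(data)


def extract_article_text(html: str) -> str:
    """Return the paragraph text of an HTML page, one blank line between paragraphs."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    parser._flush()                # keep a trailing paragraph left open
    return "\n\n".join(parser.paragraphs)


def scrape_article(url: str, timeout: float = 10.0) -> str:
    """Download the page at `url` (taken from the API response) and extract its text."""
    request = Request(url, headers={"User-Agent": "article-scraper-sketch/0.1"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_article_text(html)
```

The scraped text could then be compared against, or used in place of, the body item before it is sent to watsonx.ai for summarization. Some sites render articles with JavaScript or block automated requests, so falling back to the API body when scraping returns little or no text is probably worth keeping.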