Closed Drucifer2082 closed 2 years ago
btw running it now I get this error, can you check please?
Less code! Nice :) Let's run through it tomorrow and take next steps. I will move the meeting to a bit earlier so we have some time.
Sounds good. I'll be awake with coffee.
So, quick update.
For LSI to be effective, I need to populate the article_corpus as soon as the articles get called from the APIs. I was unsure whether returning the output of all of the functions as article would cause a catastrophic meltdown, or whether I would need to pass 3 different variables into article_corpus = [article]. To speed up the LSI analysis, I ran it on only the lead paragraphs, content, and description from each API individually instead of on the entire texts. That reduced the similarity (in my small test data) by approx. 2%, which shouldn't be an issue when dealing with >75% similarity. I will still need to scrape the data from the NYTimes page itself, but staying within the APIs speeds up the initial topic mapping, because I don't have to call each page from NYTimes and then compare.
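On the article_corpus question, here is a rough sketch of one way it could work: each API function returns its own list, and a single loop appends the short text to article_corpus as the results come in, so there is no need to juggle 3 separate variables. The fetch functions, their return shapes, the field names, and the sample text are all made up for illustration; the real API calls will look different.

```python
from typing import Callable, Dict, List

Article = Dict[str, str]

# Hypothetical stand-ins for the three API calls. The real functions would
# make network requests; these return canned data so the sketch runs.
def fetch_nytimes_api() -> List[Article]:
    return [{"lead_paragraph": "Markets rallied on Tuesday."}]

def fetch_second_api() -> List[Article]:
    return [{"content": "Stocks climbed as markets rallied."}]

def fetch_third_api() -> List[Article]:
    return [{"description": "A recipe for sourdough bread."}]

# Short text fields to try, in order (assumed names, one per API).
TEXT_FIELDS = ("lead_paragraph", "content", "description")

def short_text(article: Article) -> str:
    """Return the first non-empty short field of an article, or ''."""
    for field in TEXT_FIELDS:
        text = article.get(field)
        if text:
            return text
    return ""

def build_corpus(fetchers: List[Callable[[], List[Article]]]) -> List[str]:
    """Append each article's short text to the corpus as soon as it is fetched."""
    article_corpus: List[str] = []
    for fetch in fetchers:
        for article in fetch():
            text = short_text(article)
            if text:  # skip articles with no usable text
                article_corpus.append(text)
    return article_corpus

article_corpus = build_corpus(
    [fetch_nytimes_api, fetch_second_api, fetch_third_api]
)
```

The resulting article_corpus is a flat list of strings, one per article, which can then be tokenized and handed to whatever LSI implementation is in use. The full-page NYTimes scrape can happen later without blocking the initial topic mapping.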