jogden17 / CapstoneProject


Selection and planning for NLP implementation #7

Open jogden17 opened 1 week ago

jogden17 commented 1 week ago

Decide how we will build and deploy a summarization tool using natural language processing (NLP).

jogden17 commented 5 days ago

The best-performing pre-trained NLP models I have tested have a limit of about 1000 tokens. That leaves us two options: use a lower-performing summarization model with a higher token limit, or use chunking to break the text into 1000-token chunks, summarize each chunk, and repeat the summarization on the combined results until a final summary is reached. A mixture of the two methods is also possible.
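A rough sketch of the chunking idea, to make the loop concrete. The names `split_into_chunks` and `summarize_recursively` are made up for illustration, `summarize` stands for whatever model call we end up using, and tokens are approximated by whitespace-separated words, which is an assumption: a real model's tokenizer will count differently, so the limit should be set with some margin.

```python
from typing import Callable, List


def split_into_chunks(tokens: List[str], max_tokens: int) -> List[List[str]]:
    """Split a token list into consecutive chunks of at most max_tokens."""
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    return [tokens[i:i + max_tokens] for i in range(0, len(tokens), max_tokens)]


def summarize_recursively(
    text: str,
    summarize: Callable[[str], str],  # hypothetical: wraps the chosen model
    max_tokens: int = 1000,
) -> str:
    """Summarize chunk by chunk until the text fits in one model call."""
    # Assumption: whitespace words stand in for model tokens.
    tokens = text.split()
    while len(tokens) > max_tokens:
        chunks = split_into_chunks(tokens, max_tokens)
        # Summarize each chunk, then join the partial summaries.
        merged = " ".join(summarize(" ".join(chunk)) for chunk in chunks).split()
        if len(merged) >= len(tokens):
            # Guard against a summarizer that never shortens its input.
            raise ValueError("summarizer did not shorten the text")
        tokens = merged
    # The remaining text fits within the limit: one final pass.
    return summarize(" ".join(tokens))
```

One trade-off to keep in mind: each extra round compresses summaries of summaries, so detail from the original text is lost with every pass. Splitting on sentence or paragraph boundaries instead of fixed word counts would probably give the model cleaner input.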

jogden17 commented 4 days ago

It seems like the transformers library will be the best way to implement this, but we still need to do some testing to find the best model and pipeline for our purposes.
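A possible starting point for that testing, assuming the Hugging Face transformers library is the one meant here and that its `pipeline("summarization", ...)` task is available in the installed version. `load_hf_summarizer` and `compare_summarizers` are hypothetical helper names, and the word-count ratio is only a crude stand-in for a real quality metric.

```python
from typing import Callable, Dict


def load_hf_summarizer(model_name: str) -> Callable[[str], str]:
    """Wrap a Hugging Face summarization pipeline as a plain text -> text function."""
    # Imported lazily so the comparison helper below works without transformers.
    from transformers import pipeline

    pipe = pipeline("summarization", model=model_name)

    def summarize(text: str) -> str:
        # truncation=True cuts input that exceeds the model's token limit.
        return pipe(text, truncation=True)[0]["summary_text"]

    return summarize


def compare_summarizers(
    text: str, candidates: Dict[str, Callable[[str], str]]
) -> Dict[str, dict]:
    """Run each candidate on the same text and record summary length and ratio."""
    source_words = len(text.split())
    report = {}
    for name, summarize in candidates.items():
        summary = summarize(text)
        words = len(summary.split())
        report[name] = {
            "summary": summary,
            "words": words,
            # Fraction of the original length (by words) that the summary keeps.
            "compression": words / source_words if source_words else 0.0,
        }
    return report
```

Usage would look something like `compare_summarizers(article, {"bart": load_hf_summarizer("facebook/bart-large-cnn")})`, where the checkpoint name is just an example and not necessarily one of the models already tested. The summaries still need to be read side by side, since a good compression ratio says nothing about accuracy.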