strickvl / mlops-dot-systems

Quarto technical blog

posts/2024-06-03-isafpr-evaluating-baseline #5

Open utterances-bot opened 4 months ago

utterances-bot commented 4 months ago

Alex Strick van Linschoten - Evaluating the Baseline Performance of GPT-4-Turbo for Structured Data Extraction

I evaluated the baseline performance of OpenAI’s GPT-4-Turbo on the ISAF Press Release dataset.

https://mlops.systems/posts/2024-06-03-isafpr-evaluating-baseline.html

saeedesmaili commented 4 months ago

Very nice write-up! Looking forward to the next posts in the series. I'm very interested in learning how others approach evaluating the outputs of LLMs, especially in use cases like classifying texts or extracting structured data.

strickvl commented 4 months ago

Thanks @saeedesmaili! I'll be posting about the actual finetuning next.