Create a validation dataset for text models

redhat-et / foundation-models-for-documentation

Improve ROSA customer experience (and customer retention) by leveraging foundation models to do “gpt-chat” style search of Red Hat customer documentation assets.

Create a validation dataset for text models #19

Closed Shreyanand closed 1 year ago

Shreyanand commented 1 year ago

Fixes #23

codificat commented 1 year ago

Thanks for this! I just added a couple of minor comments about the code.

Fixes #4

It might also make #5 obsolete? Or at least partially - as it obtains basically the same data, probably in a cleaner form.

On the other hand, the obtained dataset does not include context, which is a problem for extractive QA. But I think that's fine if we focus on generative QA, where we'd also have to focus on other types of evaluation (beyond SQuAD).
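To illustrate the distinction, here is a minimal sketch (not code from this PR). It assumes the publicly documented SQuAD record layout (`context`, `question`, and `answers` with `text` and `answer_start`); the helper name `supports_extractive_qa` and both sample records are hypothetical.

```python
def supports_extractive_qa(record):
    """Return True if the record has a context and every answer is a span of it."""
    context = record.get("context")
    if not context:
        # No context at all: only usable for generative QA evaluation.
        return False
    answers = record.get("answers", {})
    texts = answers.get("text", [])
    starts = answers.get("answer_start", [])
    if not texts or len(texts) != len(starts):
        return False
    # Each answer must appear verbatim in the context at its stated offset.
    return all(context[s:s + len(t)] == t for t, s in zip(texts, starts))


# Hypothetical SQuAD-style record (extractive QA needs the context).
CONTEXT = "ROSA stands for Red Hat OpenShift Service on AWS."
ANSWER = "Red Hat OpenShift Service on AWS"
with_context = {
    "question": "What does ROSA stand for?",
    "context": CONTEXT,
    "answers": {"text": [ANSWER], "answer_start": [CONTEXT.index(ANSWER)]},
}

# Hypothetical question/answer-only record, as in a dataset without context.
without_context = {
    "question": "What does ROSA stand for?",
    "answer": ANSWER,
}

print(supports_extractive_qa(with_context))
print(supports_extractive_qa(without_context))
```

A record that fails this check can still serve generative QA, but it would need a metric that compares generated text against the reference answer rather than span offsets.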

Shreyanand commented 1 year ago

> It might also make #5 obsolete? Or at least partially - as it obtains basically the same data, probably in a cleaner form.

We could focus that PR on creating a validation dataset for extractive methods.

Shreyanand commented 1 year ago

@codificat can we merge this if it looks good?