Add v2 of chapter 7 notebook which uses current haystack

tstadel commented 2 years ago

In the most recent version of Haystack, the evaluation of pipelines is much different compared to the version described in the (first edition of the) book. Pipeline now has a method eval(), which runs the pipeline given input data from labels and returns the prediction and the labels in the format of one pandas dataframe per pipeline node. That's why we created a new second version of chapter 7's notebook which makes use of the new Haystack evaluation mechanism. The results of the evaluation between the first and the second version differ because in the previous notebook version there were two bugs: one within the notebook itself during creating the labels that made some labels disappear and one within haystack's previous evaluation logic that caused handling no_answers (e.g. when there is no answer to a certain question) in a improper way.

This PR:

adds v2 of chapter 07: This contains updates of the haystack API and several bug fixes
adds new install_requirements option is_chapter7_v2
adds requirements-chpater7-v2.txt with farm-haystack==1.4.0

Misc:

starting with Haystack 1.4.0 ElasticsearchRetriever is now called BM25Retriever

review-notebook-app[bot] commented 2 years ago

Check out this pull request on

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

review-notebook-app[bot] commented 2 years ago

View / edit / reply to this conversation on ReviewNB

lvwerra commented on 2022-06-19T11:27:48Z ----------------------------------------------------------------

I would write: "Since version xyz of Haystack, ...". Same for the following mentions.

tstadel commented on 2022-06-21T17:00:54Z ----------------------------------------------------------------

Done.

lvwerra commented 2 years ago

Looks great to me, thanks for working on this! 🚀

I let @lewtun do the closer code review as he has mainly worked on this.

review-notebook-app[bot] commented 2 years ago

View / edit / reply to this conversation on ReviewNB

lewtun commented on 2022-06-19T19:46:30Z ----------------------------------------------------------------

Maybe add your comment here that the ElasticsearchRetriever has been renamed to BM25Retriever ?

tstadel commented on 2022-06-21T17:01:08Z ----------------------------------------------------------------

Done.

review-notebook-app[bot] commented 2 years ago

View / edit / reply to this conversation on ReviewNB

lewtun commented on 2022-06-19T19:46:31Z ----------------------------------------------------------------

Not sure if this is a rendering issue with ReviewNB, but please put the preds = ... code as a codeblock with triple quotes `

tstadel commented on 2022-06-21T17:01:16Z ----------------------------------------------------------------

Done.

review-notebook-app[bot] commented 2 years ago

View / edit / reply to this conversation on ReviewNB

lewtun commented on 2022-06-19T19:46:32Z ----------------------------------------------------------------

nit: use DocumentSearchPipeline instead of raw text

tstadel commented on 2022-06-21T17:01:22Z ----------------------------------------------------------------

Done.

review-notebook-app[bot] commented 2 years ago

View / edit / reply to this conversation on ReviewNB

lewtun commented on 2022-06-19T19:46:33Z ----------------------------------------------------------------

nits:

replace "the (first edition of the) book" with "the first edition of the book"
use "Pandas" instead of "pandas"

_tstadel commented on 2022-06-21T17:01:27Z_ ----------------------------------------------------------------

Done.

review-notebook-app[bot] commented 2 years ago

View / edit / reply to this conversation on ReviewNB

lewtun commented on 2022-06-19T19:46:34Z ----------------------------------------------------------------

Line #1.    # Or if we want to calculate metrics for multiple topk values,

I think this code comment could be promoted to main text

_tstadel commented on 2022-06-21T17:01:41Z_ ----------------------------------------------------------------

Done.

review-notebook-app[bot] commented 2 years ago

View / edit / reply to this conversation on ReviewNB

lewtun commented on 2022-06-19T19:46:34Z ----------------------------------------------------------------

Maybe we can quantify how much the numbers differ by?

tstadel commented on 2022-06-21T17:01:54Z ----------------------------------------------------------------

Added the numbers of the previous plot

tstadel commented 2 years ago

Done.

View entire conversation on ReviewNB

tstadel commented 2 years ago

Done.

View entire conversation on ReviewNB

tstadel commented 2 years ago

Done.

View entire conversation on ReviewNB

tstadel commented 2 years ago

Done.

View entire conversation on ReviewNB

tstadel commented 2 years ago

Done.

View entire conversation on ReviewNB

tstadel commented 2 years ago

Done.

View entire conversation on ReviewNB

tstadel commented 2 years ago

Added the numbers of the previous plot

View entire conversation on ReviewNB

tstadel commented 2 years ago

@lewtun I made the respective changes. I found another bug regarding the logits plot in the Tokenize text for QA section. The plot now correctly depicts the labels :-)

nlp-with-transformers / notebooks

Add v2 of chapter 7 notebook which uses current haystack #48