amnpawar opened this issue 1 year ago (status: Open)
Dear amnpawar,
For the first question: the value was determined mainly by close reading of the results. The confusion matrix also helped (which answers your last question :-). For the second question: I am not sure I understand it correctly, but for me, close reading of the results is crucial for understanding the outcome and for adapting the code.
Best, Sarah
Hey @soberbichler, thank you for your response, but I still don't understand what close reading of the results means here, or how it helped you determine the threshold. Could you shed some light on it with a small example?
Actually, the second question is related to the first: changing the threshold changes the results. The check sum(most_similar_df['relevancy']) > 17: decides which entries are moved to relevant and which to non-relevant. Putting another value in place of 17 changes the complete result, which you compute via result_right = len(non_rev_0) + len(rev_3)
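To make the effect of the threshold concrete, here is a minimal sketch, not the notebook's actual code. It assumes that labels are 3 for relevant and 0 for non-relevant (as the names rev_3 and non_rev_0 suggest), and that each article is scored by summing the labels of its 10 most similar training articles. The `articles` data and the `bucket_counts` helper are hypothetical.

```python
# Hypothetical data: for each test article, the relevancy labels (3 or 0)
# of its 10 most similar training articles, plus its own true label.
articles = [
    {"neighbour_relevancy": [3] * 7 + [0] * 3, "true_label": 3},  # sum 21
    {"neighbour_relevancy": [3] * 6 + [0] * 4, "true_label": 3},  # sum 18
    {"neighbour_relevancy": [3] * 5 + [0] * 5, "true_label": 3},  # sum 15
    {"neighbour_relevancy": [3] * 6 + [0] * 4, "true_label": 0},  # sum 18
    {"neighbour_relevancy": [3] * 2 + [0] * 8, "true_label": 0},  # sum 6
    {"neighbour_relevancy": [0] * 10, "true_label": 0},           # sum 0
]


def bucket_counts(articles, threshold):
    """Count articles per bucket: predicted class (rev/non_rev) x true label (3/0)."""
    counts = {"rev_3": 0, "rev_0": 0, "non_rev_3": 0, "non_rev_0": 0}
    for article in articles:
        # Same kind of check as in the thread: sum of neighbour labels vs. threshold.
        predicted_relevant = sum(article["neighbour_relevancy"]) > threshold
        prefix = "rev" if predicted_relevant else "non_rev"
        counts[f"{prefix}_{article['true_label']}"] += 1
    return counts


for threshold in (17, 20):
    counts = bucket_counts(articles, threshold)
    result_right = counts["non_rev_0"] + counts["rev_3"]  # correctly classified
    all_ = sum(counts.values())                           # everything classified
    print(threshold, counts, result_right / all_)
```

On this toy data, raising the threshold from 17 to 20 moves one article into non_rev_0 but also moves one truly relevant article out of rev_3, so result_right does not change. A growing non_rev_0 on its own therefore says little; the other three buckets have to be read alongside it.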
@soberbichler Could you please shed some light on most_similar_df, where you use 17 as the threshold value for filtering? If I run the notebook on a different dataset, how should I determine this value? This is line [34] in the notebook:
sum(most_similar_df['relevancy']) > 17:
Also, if we change the value that
sum(most_similar_df['relevancy'])
is compared against to anything greater, it also affects the non_rev_0 values: they increase, which increases result_right for us. So how can I be sure that the entries in non_rev_0 really have the right relevancy for me?
Also, if I am not wrong, line [36] in the notebook,
all_ = len(non_rev_3) + len(rev_0) + len(non_rev_0) + len(rev_3)
denotes the confusion matrix, right?
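If the four lists are read as the cells of a 2x2 confusion matrix, then all_ is the total number of classified items and result_right / all_ is the accuracy. One possible way to choose the threshold for a different dataset, sketched below, is to sweep candidate values on a hand-labelled sample and compare precision and recall as well as accuracy. This is an assumption-laden illustration, not the procedure from the notebook (the reply above describes close reading plus the confusion matrix): the `scored` data and the `evaluate` helper are hypothetical, and selecting by F1 is only one of several reasonable criteria.

```python
# Hypothetical hand-labelled sample: (sum of neighbour relevancy, true label).
# Labels are assumed to be 3 (relevant) and 0 (non-relevant).
scored = [(27, 3), (21, 3), (18, 3), (15, 3),
          (18, 0), (12, 0), (9, 0), (6, 0), (3, 0), (0, 0)]


def evaluate(scored, threshold):
    """Return confusion-matrix cells and derived metrics for one threshold."""
    rev_3 = sum(1 for s, y in scored if s > threshold and y == 3)       # true positives
    rev_0 = sum(1 for s, y in scored if s > threshold and y == 0)       # false positives
    non_rev_3 = sum(1 for s, y in scored if s <= threshold and y == 3)  # false negatives
    non_rev_0 = sum(1 for s, y in scored if s <= threshold and y == 0)  # true negatives
    all_ = rev_3 + rev_0 + non_rev_3 + non_rev_0
    precision = rev_3 / (rev_3 + rev_0) if rev_3 + rev_0 else 0.0
    recall = rev_3 / (rev_3 + non_rev_3) if rev_3 + non_rev_3 else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "threshold": threshold,
        "accuracy": (rev_3 + non_rev_0) / all_,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Sums of ten labels in {0, 3} are multiples of 3, so step through those values.
results = [evaluate(scored, t) for t in range(0, 30, 3)]
best = max(results, key=lambda r: r["f1"])
for r in results:
    print(r)
print("best by F1:", best["threshold"])
```

Accuracy alone can be misleading when most articles are non-relevant, because a high threshold fills non_rev_0 while relevant articles are silently lost. Looking at recall (how many truly relevant articles end up in rev_3) next to accuracy makes that trade-off visible, and reading a few articles from each bucket remains a useful sanity check.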