The score operation for macro, micro and weighted averaging has been adapted according to #39; please add corresponding prompts for them. :)
Testing the sample prompts turns up quite a lot of errors. I think we should fix these before we can call this the first DA prototype:
input: what are the reasoning strategies you leverage? parsed: important all [e]
Actual parse:
input: what are the reasoning strategies you leverage? parsed: previousfilter and explain features [e]
"explain features" should not be parsed here. Is this a problem with the GPT-Neo parser, or are the grammar and prompt files for this branch incorrect?
Results in an empty response.
Answer: It's a parsing problem.
File "/home/nfel/PycharmProjects/InterroLang/flask_app.py", line 113, in sample_prompt
prompt = sample_prompt_for_action(action,
File "/home/nfel/PycharmProjects/InterroLang/logic/sample_prompts_by_action.py", line 95, in sample_prompt_for_action
raise NameError(message)
NameError: Unable to filename ending in explain.txt!
Apparently, there is also a mismatch with prompt files here.
Answer: fixed in main branch.
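For context, the lookup that raises this NameError presumably scans the prompts directory for a file whose name ends in the parsed action name plus ".txt". A minimal sketch of that lookup, where the function and directory names are assumptions and only the error message is taken verbatim from the traceback above:

```python
import os

def find_prompt_file(prompts_dir, action):
    """Hypothetical reconstruction of the prompt-file lookup in
    logic/sample_prompts_by_action.py; names here are assumptions."""
    filename_ending = f"{action}.txt"
    for filename in os.listdir(prompts_dir):
        if filename.endswith(filename_ending):
            return os.path.join(prompts_dir, filename)
    # Failure path seen above: no prompt file matches the parsed action.
    raise NameError(f"Unable to filename ending in {filename_ending}!")
```

So the mismatch is simply that no file ending in explain.txt exists for this branch.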
input: predict 22 parsed: filter id 22 and predict [e]
File "/home/nfel/PycharmProjects/InterroLang/logic/core.py", line 416, in update_state
returned_item = run_action(
File "/home/nfel/PycharmProjects/InterroLang/logic/action.py", line 46, in run_action
action_return, action_status = actions[p_text](
File "/home/nfel/PycharmProjects/InterroLang/actions/prediction/predict.py", line 291, in predict_operation
return_s = prediction_with_id(model, data, conversation, text)
File "/home/nfel/PycharmProjects/InterroLang/actions/prediction/predict.py", line 238, in prediction_with_id
model_predictions = model.predict(data, text)
File "/home/nfel/PycharmProjects/InterroLang/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1207, in __getattr__
raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'DANetwork' object has no attribute 'predict'
DA classifier does not allow predict operation yet.
Same as above (missing predict attribute in DANetwork).
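The missing piece is a predict() method (or wrapper) on DANetwork. A minimal sketch of what prediction_with_id seems to expect; the forward-pass signature and input handling are assumptions, since the real DANetwork presumably consumes tokenized dialogue turns:

```python
import torch

@torch.no_grad()
def predict(model, inputs):
    """Sketch of a predict() wrapper around the forward pass.
    Assumed signature; not the repo's actual code."""
    model.eval()
    logits = model(inputs)                 # forward pass (assumed signature)
    probs = torch.softmax(logits, dim=-1)  # per-class probabilities
    return probs.argmax(dim=-1)            # predicted dialogue acts
```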
Passed!
For instance with id 433:
The likelihood of class dummy is 0.759%
The likelihood of class inform is 3.779%
The likelihood of class question is 2.489%
The likelihood of class directive is 88.679%
The likelihood of class commissive is 4.293%
For the data with id equal to 2185, the features are
dialog: oh, yes. that would be lovely.
This is not the entire dialogue, but a single turn. I think the same issue occurs in the dataset viewer, where each ID corresponds to just one turn.
input: can you show me the precision on the data? parsed: score precision [e]
File "/home/nfel/PycharmProjects/InterroLang/flask_app.py", line 139, in get_bot_response
response = BOT.update_state(user_text, conversation)
File "/home/nfel/PycharmProjects/InterroLang/logic/core.py", line 416, in update_state
returned_item = run_action(
File "/home/nfel/PycharmProjects/InterroLang/logic/action.py", line 46, in run_action
action_return, action_status = actions[p_text](
File "/home/nfel/PycharmProjects/InterroLang/actions/prediction/score.py", line 69, in score_operation
raise NotImplementedError(f"Flag {average} is not supported!")
NotImplementedError: Flag [e] is not supported!
Answer: Passed.
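The parse "score precision [e]" apparently ends in the terminator token where an averaging flag (micro/macro/weighted, per #39) was expected, so score_operation receives the literal string "[e]". A minimal sketch of one way to guard this; the fallback to a default average is an assumption, not necessarily the merged fix:

```python
from sklearn.metrics import precision_score

SUPPORTED_AVERAGES = {"micro", "macro", "weighted"}

def scored_precision(y_true, y_pred, average):
    """Sketch of the averaging-flag handling in score_operation."""
    if average not in SUPPORTED_AVERAGES:
        average = "macro"  # assumed fallback when no flag is parsed
    return precision_score(y_true, y_pred, average=average)
```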
input: what is the way to change the prediction for the data point with the id number 1177 parsed: filter id 1177 and cfe [e]
This sentence is always classified as "question"!
input: could you describe the data and model? parsed: data and model [e]
File "/home/nfel/PycharmProjects/InterroLang/logic/action.py", line 46, in run_action
action_return, action_status = actions[p_text](
File "/home/nfel/PycharmProjects/InterroLang/actions/metadata/data_summary.py", line 66, in data_operation
score = conversation.describe.get_eval_performance_for_hf_model(dataset_name, conversation.default_metric)
File "/home/nfel/PycharmProjects/InterroLang/logic/dataset_description.py", line 220, in get_eval_performance_for_hf_model
performance_summary = self.get_score_text(y_values,
TypeError: DatasetDescription.get_score_text() missing 2 required positional arguments: 'multi_class' and 'average'
[2023-05-15 13:26:31,850] INFO in flask_app: Exception getting bot response: DatasetDescription.get_score_text() missing 2 required positional arguments: 'multi_class' and 'average'
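The call in get_eval_performance_for_hf_model needs to pass multi_class and average through, since those parameters were added with the macro/micro/weighted support from #39. For illustration, a self-contained sketch of such a score-text helper; everything except the two argument names taken from the TypeError is an assumption:

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

def get_score_text(y_true, y_pred, metric_name, multi_class, average):
    """Sketch of a score-text helper taking the two arguments the
    TypeError reports as missing; other names are assumptions."""
    avg = average if multi_class else "binary"
    metrics = {
        "accuracy": lambda: accuracy_score(y_true, y_pred),
        "f1": lambda: f1_score(y_true, y_pred, average=avg),
        "precision": lambda: precision_score(y_true, y_pred, average=avg),
        "recall": lambda: recall_score(y_true, y_pred, average=avg),
    }
    score = metrics[metric_name]()
    return f"The model scores {score * 100:.3f}% {metric_name} on the data."
```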
Answer:
input: show me some data you predict incorrectly parsed: mistake sample [e]
For all the instances in the data, the model is incorrect 1 out of 2387 times (error rate 0.0). Here are the ids of instances the model predicts incorrectly:
[[ 0 1 2 ... 2384 2385 2386]]
This does not seem right to me. The way the array is formatted should also be changed somehow. I don't know whether it's even helpful to print all the incorrectly predicted IDs; maybe just a few of them (see the sketch below)? What would be most useful for the user?
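As a suggestion for the formatting question, one could cap the printed list and summarize the rest. A minimal sketch; the function name and cutoff are arbitrary, not existing repo behavior:

```python
import numpy as np

def format_incorrect_ids(ids, max_shown=10):
    """Flatten the array, show at most max_shown ids, and report the
    remainder as a count instead of printing all of them."""
    flat = np.asarray(ids).ravel()
    shown = ", ".join(str(i) for i in flat[:max_shown])
    if flat.size > max_shown:
        shown += f", ... ({flat.size - max_shown} more)"
    return shown
```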
Passed.
input: what are the labels for all the data parsed: label [e]
For all the instances in the data:
0.0% instances have label dummy
38.542% instances have label inform
28.907% instances have label question
20.989% instances have label directive
11.563% instances have label commissive
Regarding explanation and important features: https://github.com/nfelnlp/InterroLang/blob/c1044d33c19d67cfde62b7aabb93170faa0d5327/logic/sample_prompts_by_action.py#L21 (fixed in main branch)
TODOs:
nlpattribute [DONE] for DA. Relies on #32. Note: Download the model from here: https://cloud.dfki.de/owncloud/index.php/s/m72HGNLW2TyCABr and put it in the right place according to the gin config file.