Closed: igor17400 closed this issue 8 months ago
Hi,
Thank you for your interest in the library and for reaching out.
Since you trained with a specific experiment configuration that overrides the default values of the individual modules (e.g., data, model, etc.), you need to specify the same experimental setup in evaluation as well. In the default `eval.yaml` configuration file, there are currently no default data or model configurations specified, hence the errors. You can easily fix this with the following command: `python eval.py experiment=nrms_mindsmall_pretrainedemb_celoss_bertsent.yaml`
You also have two options to specify the checkpoint:
1. In the `eval.yaml` file; however, this means you need to change the checkpoint path for each new run of the `eval.py` script.
2. As a command-line argument: `python eval.py experiment=nrms_mindsmall_pretrainedemb_celoss_bertsent.yaml ckpt_path=YOUR_CKPT_PATH`
I hope this solves your problem, and please let me know if you have further questions.
Thank you very much for your response @andreeaiana!
After running the command you provided, I faced some issues with the `eval.yaml` file. To address this, I've submitted a pull request (https://github.com/andreeaiana/newsreclib/pull/8) that includes a bug fix and additional documentation for the MINDlarge dataset, along with a sample configuration for it (`nrms_mindlarge_pretrainedemb_celoss_bertsent.yaml`).
In addition to the PR, I would like to highlight two observations from my experimentation that I believe could enhance the user experience and accuracy of the evaluation process:
- As depicted in the attached image, there's a progress indicator (red square) during the evaluation phase. However, once this loading bar reaches 2288/2288, the console seems to be "frozen" for an extended period, initially suggesting a bug. It turned out to be the time required to compute all the scores (blue square). I propose introducing a progress bar for this scoring phase, i.e., a loading bar showing that `auc`, `categ_div@k`, `categ_pers@k`, etc. are being calculated, to clarify that the process is ongoing and not stalled (see the rough sketch at the end of this comment). Do you think this enhancement would be worth implementing?
- In several of my test runs, I've noticed identical values for `ndcg@5` and `ndcg@10`, as you can see in the attached image. Have you experienced similar outcomes in your tests?
If you think these ideas are worth looking into, I'd be really keen to dive in and work on improving them!
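To illustrate the first idea, here is a rough sketch of the kind of loop I have in mind. It assumes `tqdm` is available; the `compute_scores` helper and the metric dictionary are placeholders of mine, not newsreclib's actual evaluation code:

```python
# Rough sketch (placeholder names, not newsreclib's actual code): wrap the
# per-metric computation in a tqdm loop so the console shows which score
# (auc, categ_div@k, categ_pers@k, ...) is currently being calculated.
from tqdm import tqdm


def compute_scores(metric_fns: dict) -> dict:
    """metric_fns maps a metric name (e.g. 'auc') to a zero-argument callable."""
    results = {}
    for name, compute_fn in tqdm(metric_fns.items(), desc="Computing metrics", unit="metric"):
        results[name] = compute_fn()
    return results
```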
> After running the command you provided, I faced some issues with the `eval.yaml` file. To address this, I've submitted a pull request (#8) that includes a bug fix and additional documentation for the MINDlarge dataset, along with a sample configuration for it (`nrms_mindlarge_pretrainedemb_celoss_bertsent.yaml`).
Many thanks for the bug fix and extra documentation @igor17400, I accepted your PR!
> In addition to the PR, I would like to highlight two observations from my experimentation that I believe could enhance the user experience and accuracy of the evaluation process:
>
> - As depicted in the attached image, there's a progress indicator (red square) during the evaluation phase. However, once this loading bar reaches 2288/2288, the console seems to be "frozen" for an extended period, initially suggesting a bug. It turned out to be the time required to compute all the scores (blue square). I propose introducing a progress bar for this scoring phase, i.e., a loading bar showing that `auc`, `categ_div@k`, `categ_pers@k`, etc. are being calculated, to clarify that the process is ongoing and not stalled. Do you think this enhancement would be worth implementing?
I agree with you, an additional progress bar for the scoring calculation phase would be a lot more informative.
> - In several of my test runs, I've noticed identical values for `ndcg@5` and `ndcg@10`, as you can see in the attached image. Have you experienced similar outcomes in your tests?
I have never encountered this issue before; in all my experiments `ndcg@5` and `ndcg@10` are always different, with the former being lower than the latter.
> If you think these ideas are worth looking into, I'd be really keen to dive in and work on improving them!
Thanks for proposing these enhancements, they would definitely improve the current functionality. It would be great if you could work on them :)
Great @andreeaiana! I'll start working on those features now 😊
I just ran the evaluation of `nrms_mindsmall_pretrainedemb_celoss_bertsent` again and obtained exactly the same scores for `ndcg@5` and `ndcg@10`, as can be seen in the attached image below. I'm trying to understand why this is happening.
I believe I found the reason. The `setup.py` file specifies an old version of `torchmetrics` (`0.11.4`). After updating it to version `1.3.1`, the metrics are now being computed correctly.
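For anyone who wants to double-check, a toy example along these lines should now show different values for the two cutoffs. It assumes `torchmetrics` 1.x and its `RetrievalNormalizedDCG` metric with the `top_k` argument; the scores and relevance labels below are made up:

```python
# Toy sanity check (made-up data): with the only relevant item at rank 7,
# ndcg@5 should be 0 while ndcg@10 should be positive (1 / log2(8) ≈ 0.33).
import torch
from torchmetrics.retrieval import RetrievalNormalizedDCG

preds = torch.linspace(1.0, 0.1, 10)         # 10 candidate scores, strictly decreasing
target = torch.zeros(10, dtype=torch.long)
target[6] = 1                                # the only relevant candidate sits at rank 7
indexes = torch.zeros(10, dtype=torch.long)  # all candidates belong to the same impression

print(RetrievalNormalizedDCG(top_k=5)(preds, target, indexes=indexes))
print(RetrievalNormalizedDCG(top_k=10)(preds, target, indexes=indexes))
```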
> I believe I found the reason. The `setup.py` file specifies an old version of `torchmetrics` (`0.11.4`). After updating it to version `1.3.1`, the metrics are now being computed correctly.
Thanks for letting me know, I updated the `setup.py`, `requirements.txt`, and `environment.yaml` files with the latest version of `torchmetrics`.
Hi there,
Firstly, congratulations on the great work and the publication of this very useful library!
I'm encountering a bug while attempting to execute the `eval.py` script. I've successfully trained the NRMS model for 20 epochs; however, my machine disconnected before the automatic testing phase. I trained the model using the configuration provided in `nrms_mindsmall_pretrainedemb_celoss_bertsent.yaml`.

Now, I'm attempting to execute the `eval.py` script with the `eval.yaml` file defined as shown below. However, when I run the command `python eval.py`, I receive the following error:

Do you have any insights into why this might be happening?

`eval.yaml`
This description outlines the issue I'm facing while attempting to execute the `eval.py` script after training the NRMS model. Any help or suggestions would be greatly appreciated! Thank you!