Hi, note that the MSC-RAG model has a few hyperparameters that are not exactly the same as in your train cmd:

- -t msc should instead be -t msc:Session1Self:is_convai2_session_level=True,msc,msc:SessionBaseMsc:session_id=2,msc:SessionBaseMsc:session_id=3,msc:SessionBaseMsc:session_id=4
- --min-doc-token-length 128 instead of 64, and --max-doc-token-length 256 instead of 128, since the model needs to deal with the raw history
- --retriever-ignore-phrases persona:,__his__, --memory-extractor-phrase persona:,__his__, --memory-delimiter <SPECIALTOKEN>, and --previous-session-delimiter <SPECIALTOKEN>, to make sure the correct memory documents are retrieved at the session level

Please let me know if you can reproduce similar numbers.
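For concreteness, here is a rough sketch (not the exact command used for the paper) of how those overrides could be spliced into a `parlai train_model` call; everything not listed above (model choice, --init-model, batch size, optimization flags, etc.) should still come from your original train cmd:

```sh
# Sketch only: combines the flags quoted above; all other flags are omitted
# and should be taken from the original training command.
parlai train_model \
  -t msc:Session1Self:is_convai2_session_level=True,msc,msc:SessionBaseMsc:session_id=2,msc:SessionBaseMsc:session_id=3,msc:SessionBaseMsc:session_id=4 \
  --min-doc-token-length 128 \
  --max-doc-token-length 256 \
  --retriever-ignore-phrases 'persona:,__his__' \
  --memory-extractor-phrase 'persona:,__his__' \
  --memory-delimiter '<SPECIALTOKEN>' \
  --previous-session-delimiter '<SPECIALTOKEN>'
```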
Hi!
I am trying to reproduce the MSC 2.7B (RAG) results from the “Beyond Goldfish Memory: Long-Term Open-Domain Conversation” paper. Initially, I used the same hyperparameters as the released checkpoint zoo:msc/summscrag3B/model, except that instead of the summsc task I used the original msc task. However, I cannot reproduce the performance reported in the paper for the MSC 2.7B (RAG) model.
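For reference, one way to double-check which settings differ is to dump the options stored alongside the released checkpoint and diff them against the opts of the new run. A rough sketch, assuming the checkpoint sits in the default ParlAI data directory (adjust the path to wherever zoo:msc/summscrag3B/model resolves on your machine):

```sh
# Sketch only: ParlAI stores the options a model was trained with as JSON in a
# <modelfile>.opt file next to the checkpoint; pretty-print it so it can be
# diffed against the .opt of your own run. The path below is an assumption
# based on the default data directory layout.
python -m json.tool ~/ParlAI/data/models/msc/summscrag3B/model.opt
```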
And the results in the paper are:
My training script is