Closed LoganDark closed 1 year ago
Please document the possibility of skipping logits calculation in rwkv.h. The commit description is already well written and can be copy-pasted into the doc.
I would also suggest implementing it in the PyTorch wrapper, but this is optional, especially given that the wrapper will be rewritten soon.
There is no need to calculate logits when the caller is not going to use them, so perform some trickery to avoid evaluating that part of the cgraph in that case.
Depends on #103
Discussed on https://github.com/saharNooby/rwkv.cpp/issues/106#issuecomment-1605890591
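
For illustration, here is a minimal sketch of the idea in the description above: skip the output projection when the caller does not ask for logits. It is a self-contained toy, not the rwkv.cpp implementation. Every name in it (`toy_eval`, `toy_head`, `toy_update_state`, `toy_head_calls`, the `TOY_*` sizes) is hypothetical, and it assumes that "logits not wanted" is signalled by passing `NULL` for the output buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy sizes; a real model is far larger. */
#define TOY_N_EMBED 4
#define TOY_N_VOCAB 3

/* Counts how many times the (expensive) output projection ran. */
static int toy_head_calls = 0;

/* Stand-in for the recurrent part of the model: always required,
 * because the next token depends on the updated state. */
static void toy_update_state(float * state, int token) {
    for (size_t i = 0; i < TOY_N_EMBED; i++) {
        state[i] = 0.5f * state[i] + (float) (token + 1);
    }
}

/* Stand-in for the final projection from hidden state to logits.
 * In a real model this is a large matrix multiplication. */
static void toy_head(const float * state, float * logits_out) {
    toy_head_calls++;
    for (size_t v = 0; v < TOY_N_VOCAB; v++) {
        float sum = 0.0f;
        for (size_t i = 0; i < TOY_N_EMBED; i++) {
            sum += state[i] * (float) (v + 1);
        }
        logits_out[v] = sum;
    }
}

/* Assumption: passing NULL for logits_out means "do not compute logits".
 * The state is updated either way; only the head is skipped. */
void toy_eval(float * state, int token, float * logits_out) {
    assert(state != NULL);
    toy_update_state(state, token);
    if (logits_out != NULL) {
        toy_head(state, logits_out);
    }
}
```

A caller processing a prompt would pass `NULL` for every token except the last one, since only the final logits are needed for sampling. The state is still advanced on each call, so the result of the last call is unaffected by the skipped projections.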