Open chenwuperth opened 9 months ago
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Thank you!
Hello, it seems that the `gather_output` parameter is always set to `False` in the `embedding_to_parallel_embedding` function. In some use cases, I found it essential to set it to `True` in order to gather `lm_head` from all ranks before going ahead with the next step (e.g. adding some pre-defined bias terms). Could this parameter be made configurable and exposed in the `embedding_to_parallel_embedding` function signature? Thanks.
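To illustrate why the gathered output matters here, below is a minimal, library-independent sketch in plain Python. It simulates a column-parallel `lm_head` whose vocabulary dimension is split across ranks. Everything in it (`column_parallel_lm_head`, `shard_columns`, `world_size`, the toy weights) is hypothetical and only meant to show the behaviour being requested; it is not the library's implementation of `embedding_to_parallel_embedding`.

```python
from typing import List, Union

Matrix = List[List[float]]


def matmul(x: Matrix, w: Matrix) -> Matrix:
    """Plain-Python matrix product: (n, k) @ (k, m) -> (n, m)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]


def shard_columns(w: Matrix, world_size: int) -> List[Matrix]:
    """Split the columns (vocab dimension) of w evenly across simulated ranks."""
    cols = len(w[0])
    assert cols % world_size == 0, "vocab size must divide evenly across ranks"
    step = cols // world_size
    return [[row[r * step:(r + 1) * step] for row in w] for r in range(world_size)]


def column_parallel_lm_head(
    x: Matrix, w: Matrix, world_size: int, gather_output: bool = False
) -> Union[List[Matrix], Matrix]:
    """Hypothetical stand-in for a column-parallel lm_head.

    gather_output=False: return one partial logits matrix per rank.
    gather_output=True:  concatenate the partial logits along the vocab axis.
    """
    partial = [matmul(x, shard) for shard in shard_columns(w, world_size)]
    if not gather_output:
        return partial
    return [sum((p[i] for p in partial), []) for i in range(len(x))]


# Toy example: hidden size 2, vocab size 4, two simulated ranks.
x = [[1.0, 2.0]]
w = [[1.0, 0.0, 2.0, 0.0],
     [0.0, 1.0, 0.0, 2.0]]
bias = [0.5, 0.0, 0.0, -1.0]  # pre-defined bias over the full vocabulary

per_rank = column_parallel_lm_head(x, w, world_size=2, gather_output=False)
full = column_parallel_lm_head(x, w, world_size=2, gather_output=True)

# The full-vocab bias can only be applied directly to the gathered logits.
biased = [[logit + b for logit, b in zip(row, bias)] for row in full]
```

Without gathering, each rank only holds its own slice of the logits, so a bias defined over the whole vocabulary would first have to be sharded the same way. With `gather_output=True` the bias can be added in one step.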