Hello,
If someone uses your source-code embeddings for any of the downstream tasks your model supports, would you recommend evaluating with F1 score or with accuracy, and why? And if the model's F1 score is low but its accuracy is high, what would you recommend?
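
To make the second case concrete, here is a minimal sketch of the situation I have in mind. The labels are made up for illustration (they are not results from your model), and the helper names `accuracy` and `f1_score` are my own, written in plain Python rather than taken from any library. It assumes a binary downstream task where the positive class is rare:

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        # No true positives: precision or recall is zero, so F1 is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up, imbalanced labels: 95 negatives and 5 positives.
y_true = [0] * 95 + [1] * 5
# Hypothetical predictions: every negative is right, but only 1 of 5 positives is found.
y_pred = [0] * 95 + [1] + [0] * 4

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
print(f"accuracy={acc:.2f} f1={f1:.2f}")  # accuracy=0.96 f1=0.33
```

In this toy example the accuracy is high only because the negative class dominates, while F1 exposes the low recall on the positive class. Is this the kind of gap I should expect on imbalanced downstream tasks, and which number should I trust in that case?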