Open allomy opened 2 years ago
Hi @allomy , Thank you for your interest in code2vec!
I think that you can change the loss here: https://github.com/tech-srl/code2vec/blob/master/tensorflow_model.py#L228 from the standard cross entropy to sigmoid cross entropy: https://www.tensorflow.org/api_docs/python/tf/compat/v1/nn/sigmoid_cross_entropy_with_logits
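To illustrate what the sigmoid cross entropy computes, here is a minimal pure-Python sketch of the per-label formula that the linked TensorFlow documentation describes, `max(x, 0) - x * z + log(1 + exp(-|x|))`. It is not the code2vec implementation: the function is a stand-in written for illustration, and the label and logit values are made up. In the model itself you would call the TensorFlow op on a multi-hot label tensor instead.

```python
import math

def sigmoid_cross_entropy_with_logits(labels, logits):
    """Per-label loss for one example.

    labels: multi-hot list (1.0 if the example has that label, else 0.0).
    logits: raw, pre-sigmoid scores, one per label.
    Uses the numerically stable form max(x, 0) - x * z + log(1 + exp(-|x|)).
    """
    return [
        max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))
        for z, x in zip(labels, logits)
    ]

# Hypothetical example: 4 possible labels, the sample carries labels 0 and 2.
labels = [1.0, 0.0, 1.0, 0.0]
logits = [2.0, -1.0, 0.5, -3.0]

per_label = sigmoid_cross_entropy_with_logits(labels, logits)
loss = sum(per_label) / len(per_label)  # average over labels
```

Unlike softmax cross entropy, each label is scored as an independent yes/no decision, so several labels can be "on" for the same example.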
But you will also need to change the pipeline to support reading multi-labeled examples. Follow the variable `target_index` here: https://github.com/tech-srl/code2vec/blob/master/path_context_reader.py and modify it to get a list of targets for every example.
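As a rough sketch of what "a list of targets for every example" could look like, the snippet below turns one target field into a multi-hot vector that a sigmoid cross entropy loss can consume. Everything here is an assumption made for illustration: the `;` separator, the `parse_targets` and `to_multi_hot` helpers, the toy vocabulary, and the row layout (target field first, then space-separated contexts). The actual reader in `path_context_reader.py` works on TensorFlow tensors and would need the equivalent change there.

```python
def parse_targets(target_field, separator=";"):
    """Split a target field such as "sort;io" into a list of target names.

    The ";" separator is a hypothetical convention, not part of code2vec.
    """
    return [name for name in target_field.split(separator) if name]

def to_multi_hot(targets, target_to_index):
    """Build a multi-hot vector with 1.0 at the index of every known target."""
    vector = [0.0] * len(target_to_index)
    for name in targets:
        index = target_to_index.get(name)
        if index is not None:  # silently skip out-of-vocabulary targets
            vector[index] = 1.0
    return vector

# Hypothetical target vocabulary and input row.
target_to_index = {"sort": 0, "search": 1, "parse": 2, "io": 3}
row = "sort;io ctx1 ctx2"

target_field = row.split(" ", 1)[0]  # assumed: the first field holds the targets
multi_hot = to_multi_hot(parse_targets(target_field), target_to_index)
```

The resulting vector has one entry per target in the vocabulary, which matches the shape of the logits that the model already produces for the single-label case.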
Best, Uri
Hi @urialon , thank you for your quick response. I'll try it soon.
Hi @urialon , sorry for the delayed response. I have tried to modify the code related to `target_index`, but got lost in the code... Could you give more information about modifying it to get a list of targets for every sample? Thank you in advance for your help.
Hi @allomy , Actually it might be easiest for you to use https://code2seq.org/ . It predicts a sequence of labels and not multi-label, but it may either be a good approximation, or easier to adapt for multi-label (just change the loss computation, not the entire data reading pipeline).
Best, Uri
Thank you @urialon , I will take a look at code2seq.
I'm trying to use code2vec for multi-label classification, where one sample can belong to several labels. Could you give some suggestions on what to do with the model?
Thank you in advance for your help!