JaktensTid closed this issue 1 year ago
Yes, please refer to the example below. Salesforce/codet5-base-multi-sum is a CodeT5-base model that is jointly trained on 6 code summarization tasks using CodeSearchNet data.
from transformers import RobertaTokenizer, T5ForConditionalGeneration

if __name__ == '__main__':
    tokenizer = RobertaTokenizer.from_pretrained('Salesforce/codet5-base')
    model = T5ForConditionalGeneration.from_pretrained('Salesforce/codet5-base-multi-sum')

    text = """def svg_to_image(string, size=None):
    if isinstance(string, unicode):
        string = string.encode('utf-8')
    renderer = QtSvg.QSvgRenderer(QtCore.QByteArray(string))
    if not renderer.isValid():
        raise ValueError('Invalid SVG data.')
    if size is None:
        size = renderer.defaultSize()
    image = QtGui.QImage(size, QtGui.QImage.Format_ARGB32)
    painter = QtGui.QPainter(image)
    renderer.render(painter)
    return image"""

    input_ids = tokenizer(text, return_tensors="pt").input_ids
    generated_ids = model.generate(input_ids, max_length=20)
    print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
    # this prints: "Convert a SVG string to a QImage."
Hi @yuewang-cuhk, thank you for your help! I have another question: is it possible to pass an entire class or a group of classes instead of a single function, so the model can pick up the context more precisely, or must I fine-tune it for this task?
The model should also be able to summarize a larger piece of code to some extent. However, since it was trained on code-text pairs at the function level, fine-tuning it on your new use case would definitely be better.
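If fine-tuning is not an option, one possible workaround (a rough sketch, not something the CodeT5 authors prescribe) is to split a class into its individual functions and summarize each one separately, which stays close to the function-level data the model was trained on. The helper names `split_into_functions` and `summarize_snippets` below are hypothetical, and the 512-token input cap is an assumption about the checkpoint's limit; longer inputs are simply truncated.

```python
import ast
import textwrap
from typing import List


def split_into_functions(source: str) -> List[str]:
    """Return the dedented source of every function or method in `source`."""
    tree = ast.parse(source)
    chunks = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # padded=True keeps the first line's indentation so dedent works
            segment = ast.get_source_segment(source, node, padded=True)
            if segment is not None:
                chunks.append(textwrap.dedent(segment))
    return chunks


def summarize_snippets(snippets: List[str],
                       max_input_tokens: int = 512,   # assumed input limit
                       max_summary_tokens: int = 20) -> List[str]:
    """Summarize each snippet with Salesforce/codet5-base-multi-sum."""
    # Imported lazily so the splitting helper works without transformers.
    from transformers import RobertaTokenizer, T5ForConditionalGeneration

    tokenizer = RobertaTokenizer.from_pretrained('Salesforce/codet5-base')
    model = T5ForConditionalGeneration.from_pretrained(
        'Salesforce/codet5-base-multi-sum')

    summaries = []
    for code in snippets:
        # Truncate long inputs instead of exceeding the assumed token limit.
        input_ids = tokenizer(code, return_tensors="pt", truncation=True,
                              max_length=max_input_tokens).input_ids
        generated_ids = model.generate(input_ids, max_length=max_summary_tokens)
        summaries.append(
            tokenizer.decode(generated_ids[0], skip_special_tokens=True))
    return summaries
```

Calling `summarize_snippets(split_into_functions(class_source))` would then give one summary per method. This loses class-level context, so it is only a stopgap compared with fine-tuning on class-level examples.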
Hi, is there an example of code summarization? Thank you in advance!