haesleinhuepf / human-eval-bia

Benchmarking Large Language Models for Bio-Image Analysis Code Generation
MIT License

What about future models learning from our resource? #53

Open tischi opened 2 months ago

tischi commented 2 months ago

I wonder what to do with the following observation: future LLMs will see our repo during training and will thus probably be very good at passing our tests. Is that a good or a bad thing?

haesleinhuepf commented 2 months ago

That's good! I just wrote a sentence about this in the paper: "Hence, our test-cases may enable the LLM community to train models covering our BIA use-cases better".