Voyz / ibind

IBind is a REST and WebSocket client library for Interactive Brokers Client Portal Web API.

AI-generated code in issues #94

Open · Voyz opened 1 month ago

Voyz commented 1 month ago

hey @weklund, @salsasepp and anyone else interested - I wanted to get your thoughts on the topic of AI-generated code from the perspective of maintainers of this library.

I've noticed that some users paste AI-generated code into issues, often without checking the documentation, reading error messages or understanding what their code is doing.

For example, see recent issue #92: IbkrClient.positions() seems limited to 100 results, pagination parameters ineffective (TypeError)

The user opened a verbose issue because they passed hallucinated page_id and pageSize arguments to client.positions(), which clearly doesn’t support those. The traceback even said as much.
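
For illustration, the call in question looked roughly like this (a reconstruction, not the user's exact code):

```python
from ibind import IbkrClient

client = IbkrClient()

# Neither keyword argument exists on positions() - both were
# hallucinated by the AI tool, so the call fails immediately with
# a TypeError like:
# TypeError: positions() got an unexpected keyword argument 'page_id'
positions = client.positions(page_id=2, pageSize=100)
```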

While I, and probably everyone else these days, use AI to generate code - hell I even used AI to discuss this very topic first - I feel that this kind of behaviour crosses the line. This wastes our time and bloats the issue tracker with noise. If they can't be bothered to read their code, it's a little arrogant to ask us to do so.

I'm considering introducing a formal policy, for example:

“Issues that stem from AI-generated code and demonstrate little to no effort in understanding the error, reading the documentation, or checking the method signatures will be closed immediately.”

Alongside it, I'd suggest an internal maintainers' policy of immediately closing such issues and posting an appropriate message as soon as we spot code along the lines of:

```python
# Import the utility function if it's accessible (adjust path if necessary)
# Example using get_ibkr_client (replace "live_trading" if needed)
# Assuming you have this helper
```

Or any other telltale sign that the code has been AI-generated.

While this may seem harsh and a step towards being a little less beginner-friendly, I think it will set the right tone for the discussions, and indicate that we value our time and expect the same of others.

We could add this policy to the issue templates, possibly along with CONTRIBUTING.md. The closing message we'd post can encourage the user to review and redact their code, and explain that we may reopen the issue once we see they've taken it seriously.

Lastly: while it is a little annoying to see this behaviour, I'm not particularly attached to this new policy. I think it is a good idea to put it in place already, but I'm open to being convinced otherwise, since this is more of a sporadic occurrence at the moment. I'd probably stand behind it more strongly if it happened more frequently.

salsasepp commented 1 month ago

Oh! I didn't even realize the code in #92 might have been AI generated (with AI knowing IBKR's docs, but not ibind code apparently). Interesting. Thank you for bringing this up, will think about it.

weklund commented 1 month ago

For now, I don't believe we should set a formal policy against AI-generated content, because the number of issues with hallucinations should go down as AI tools get better and as we take steps to better describe our types. Being friendly towards issues with mistakes will earn more trust with users in the long run. If the volume gets crazy I would reconsider ha.

Yes, the given example is pretty bad... just by reading the stack trace someone could see what the issue is.

I realize I haven't been as active addressing open issues for IBind, but I can empathize with the frustration on increased load from poorly formed Github issues. I want to push myself to help out more on that front. :)

I use Cursor today with the Gemini 2.5 Max model, and its coding agent is able to understand IBind's underlying implementation when writing code for my personal project. My dev environment would dynamically see the error and rerun its implementation in a feedback loop to find a fix.

On the topic of how we could help address this exact issue, I'm wondering if there are improvements we can make on the typing front. I don't have a complete argument ready yet, but as I use IBind more and more in my personal project, I find the amount of code I have to write to handle typing and response validation pretty heavy. I can post a separate issue to lay out what I mean, but I'm wondering if using Pydantic models could a) reduce the code IBind users have to write and b) give more explicit structure to interactions with IBind, so that humans and AI alike can better understand it.
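
As a rough sketch of what I mean (the model and field names here are placeholders I made up, not IBind's actual response schema):

```python
from pydantic import BaseModel, ValidationError


# Hypothetical model - real fields would mirror the Client Portal
# API's position payload.
class Position(BaseModel):
    conid: int
    position: float
    avg_cost: float | None = None  # not every response carries every field


raw = {"conid": 265598, "position": 10.0, "avg_cost": 182.5}
try:
    pos = Position.model_validate(raw)  # Pydantic v2 validation
    print(pos.conid, pos.position)
except ValidationError as e:
    print(e)  # structured report of what's missing or mistyped
```

Users would get validated, typed objects instead of raw dicts, and the models themselves would document the response structure for both humans and AI tools.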

Voyz commented 1 month ago

@weklund

I realize I haven't been as active addressing open issues for IBind

Honestly, don't feel bad. You did a superb job leading the PR front with the CI, the contributing guide, issue templates, formatting, all that. I think we did splendidly focusing on different responsibilities here. You're doing great, and @salsasepp has been an enormous help with the issues recently.


On the actual topic: can you clarify your position? I see that you mention AI tools getting better, which is a point against the policy, but you also write that you empathise with the frustration.

I 100% agree that being friendly to people making mistakes is worth doing. Making a mistake and/or being inexperienced is different than not putting in the effort.

AI tools getting better or not, that user simply did not put in the effort to read their own code and the documentation. This will happen no matter how good AI tools get. You know, we just added that issue template asking people to read the IBind docs; now we have a user who couldn't be bothered to read them - should we not react? Are we being friendly to people making mistakes here?


On Pydantic - very interesting stuff, please open a new issue and let's take it from there 👏

lach1010 commented 1 month ago

+1 on Pydantic!

One thing to keep in mind here is that not all responses have all values every time, which may somewhat take away from the value.

On the AI policy. I don't think it would in any way change a user's behaviour - particularly those who are blindly pasting generated code.

I'd also think that most blind copy-pasters wouldn't first think to post an issue in the repo, but would look to the docs first. Beginner troubleshooting steps wouldn't start with an issue on a GitHub repo. Although perhaps this isn't quite as true when the docs are a wiki on the repo?

I think an immediate close might make people feel slighted.

Also, as with the channel vs topic/target discussion, there are (or can be) inconsistencies between the IBKR API and the parameters in IBind. As such I think confusion from both an LLM and a user is understandable.

salsasepp commented 1 month ago

I agree with what @weklund and @lach1010 have been saying about AI-generated code. We may soon reach a point where all code is AI-generated partly or in full, so I believe the question is really the other half of @Voyz's proposal: "...and demonstrate little to no effort in understanding the error, reading the documentation, or checking the method signatures..."

My personal preference is to try and help people out, and give them the benefit of the doubt. I am clueless myself on so many topics. I will gladly invest 10 minutes into an answer when, at the same time, I am getting to know the ibind code and docs better, or Python for that matter. I'd normally only bail out if a person demonstrates continued unwillingness to invest time themselves.

I'd be hesitant to introduce a policy that might adversely affect legitimate questions and issues, especially when the wording of the policy would necessarily be somewhat diffuse and subject to interpretation. The absence of a policy, on the other hand, would mean the absence of alignment between maintainers and other participants, which is also a bad thing.

I don't seem to have a clear opinion.

Voyz commented 1 month ago

@lach1010 thanks for your thoughts!

On the AI policy. I don't think it would in any way change a users behaviour

If directly addressing the problem with the users in question wouldn't change their behaviour, what do you think would? I can't imagine a situation where we close an issue for a user who is looking for genuine advice, ask them to avoid posting non-redacted AI-generated code, and they come back making the exact same mistake. I indeed think it will be one method to educate and improve - how else would they know it's not okay?

I'd also think that most blind copy pasters wouldn't first think to post an issue in the repo - but look to docs first.

My experience is the opposite. A large number of issues in the repositories I manage come from not reading the docs and/or other issues.

I think an immediate close might make people feel slighted.

Yes, but that's inherent to any issue being closed by anyone other than its author. Should we not close duplicates for the same reason? Irrelevant questions? There has to be a line drawn somewhere, and I think formulating the closing message appropriately will make this a polite and reasonable process. Not 'you're lazy, go somewhere else', but 'hey, here's what you did wrong, please try not doing this again'.

Also as with the channel vs topic/target discussion there is/can be inconsistencies between the IBKR API and the parameters in IBind. As such I think confusion from both an LLM and user is understandable.

Fair point on the topic/channel, but also note that, irrespective of how the discrepancies appear, the question is more about the minimum effort we're asking from the users. Would working on an issue presented as 'hey please fix my code: ...' be reasonable? I don't think so.

Confusion is understandable; the point of introducing a policy is about the effort required from the issue author. Long, non-redacted AI-generated code issues redirect the effort of reading the code from the author to the maintainers. And I'm asking whether we think this is an acceptable interaction between maintainers and users.

Voyz commented 1 month ago

@salsasepp thanks for your thoughts too 👍

We may soon reach a point where all code is AI-generated partly or in full, so I believe the question is really the other half of @Voyz's proposal: "...and demonstrate little to no effort in understanding the error, reading the documentation, or checking the method signatures..."

I think that part of the proposal is already assumed, but maybe you're right, it should be specified.

When this happens, I indeed leave a message asking for more effort, without attempting to answer the question until that happens.

I'd be hesitant to introduce a policy that might adversely affect legitimate questions and issues

I 100% agree. I'm asking how we should define 'legitimate'. Would not reading the docs and the stack trace be an example of a legitimate issue?

Additionally, questions - legitimate or not - should be prepared for review by the maintainer. Posting an AI-generated code dump is just one example of a question that is not prepared for review.

Voyz commented 2 weeks ago

hey @weklund @salsasepp @lach1010 I wanted to bump this topic, as I've seen two new sloppy AI-generated contributions since I brought this up; I think it would be useful to slowly move towards a solution regarding this.

From re-reading everyone's points, I think it boils down to: what should we consider "good enough" to invest our time into? AI-generated or not, the question seems to pivot towards the amount of effort put into the contribution, and what to do if the author clearly shows little effort (i.e. the contribution is overly long, includes verbose repetitions, shows they haven't read their stack trace, etc.)

weklund commented 2 weeks ago

Hmm. I don't have an overall point, but I have 2 observations:

  1. Is this a function of a growing community doing this, or maybe a single person? Does this change how we think about this and the policy?
  2. Would it be better if we were more aggressive about closing issues that are stale? That way working through open issues seems more manageable, so it isn't as bad to address one that potentially isn't good enough?

Trying to brainstorm the smallest policy change.

Voyz commented 2 weeks ago

@weklund

Is this a function of a growing community doing this, or maybe a single person? Does this change how we think about this and the policy?

Different people. So I see it as a potentially recurring challenge.

Would it be better if we were more aggressive on closing issues that are stale? That way working with open issues seems more manageable, so it isn't as bad to address an issue that potentially isn't good enough?

I think this misses the point a little. If there were 0 open issues and someone posted a new one that's low effort / AI slop, then I'd still rather we spend our time on meaningful things - say updating the docs, adding new endpoints - or other projects.

salsasepp commented 1 week ago

I'm afraid I won't have any meaningful input to this issue. For me, the recent "AI slop issues" have triggered me to install AI into my IDE for the first time ever, which has definitely improved my understanding of what's going on with copilot & friends. I'm learning even from slop, it seems. The size and frequency of "low-effort" issues is still below the threshold that would annoy me personally. I definitely see how ibind maintainers could view this differently. That's all I can say.

weklund commented 22 hours ago

Referencing Pydantic here as well. https://github.com/Voyz/ibind/issues/105

Would love your thoughts as well @lach1010 😁