stanfordnlp / dspy

DSPy: The framework for programming—not prompting—foundation models
https://dspy.ai
MIT License
18.75k stars 1.44k forks source link

Questionable Ruff Errors - Define Code Style? #589

Closed mgbvox closed 7 months ago

mgbvox commented 8 months ago

Noticed ruff throwing some pre-commit errors on my dev fork - some of these seem to fly in the face of existing code (not counting the stuff I added), and I'm wondering if we want to ignore them:

A003 Class attribute compile is shadowing a Python builtin

.compile is integral to the dspy API, this probably wants to be ignored.

S101 Use of assert detected

Asserts are used all over the code base - do we want to allow them, or do we want to remove them? Note that asserts are sometimes necessary, e.g. to convince mypy that something isn't None

ERA001 Found commented-out code

Used in places to highlight TODOs; I despise dead code but get that a young project like this might need it in places, perhaps we silence for the time being?

T201 print found

Lots of error logging/warnings appear to be done via print(); is this the intended way to warn the user, or do we want to do something more sophisticated e.g. using logger?

SIM108 Use ternary operator [some code] instead of if-else-block

This is a tricky one; here's the block it's complaining about:

if self.metric:
    metric_val = self.metric(example, prediction, trace)
    if self.metric_threshold:
        # this is the problem bit
        success = metric_val >= self.metric_threshold
    else:
        success = metric_val

While perhaps there's a way to rewrite the x = a >= b form using if-else blocks, I feel like the code as it stands is the most elegant/intuitive. I think I straight up disagree with this flake rule, and would think it should be silenced, but... thoughts?

Also - not sure how long Ruff has been part of the pre-commit flow, but is there a comparable implementation of ruff checks for PRs?

mgbvox commented 8 months ago

467 (made relevant changes here; updates to an existing PR)

mgbvox commented 8 months ago

@okhat tagging so this gets some visibility, and you know who to tag better than me.

okhat commented 8 months ago

@thomasahle @CyrusOfEden @isaacbmiller

okhat commented 8 months ago

I just tagged a few folks who are very knowledgeable about this — and who built the CI things we have. My involvement in this part is very small.

CyrusOfEden commented 8 months ago

These three could be set to ignored in the pyproject:

A003 Class attribute compile is shadowing a Python builtin S101 Use of assert detected ERA001 Found commented-out code

For the others:

T201 print found

Using logger is a good idea for a PR! It's a big change but would provide lots of benefits over print. Quite a wide-sweeping change, and the biggest gotcha is making sure the right stuff still gets printed out in Google Colab / Jupyter notebooks.

SIM108 Use ternary operator [some code] instead of if-else-block

I agree that that's the cleanest way to write that code, but I don't think that's the cleanest code one could write.

I'm actually a little confused by the code you shared there because success implies a boolean but when there's no threshold it's a float? That's weird.

I would recommend that code get refactored, here's a try:

if self.metric:
    metric_val = self.metric(example, prediction, trace)
    if self.metric_threshold:
        metric_val = float(metric_val >= self.metric_threshold)
# replace future success refs with metric_val
thomasahle commented 8 months ago

I agree with Cyrus, except I would add that were definitely using too many asserts that could be more descriptive errors. I think assert has a place (like the mupy use you mention), but it's good to be reminded to use raise. So I'm happy to just ignore line-by-line where an assert is really needed.

I also think the commented out code should be removed and put somewhere else.

isaacbmiller commented 8 months ago

Also because ruff check . brings up 1500 errors, we don't enforce the pre-commit.

It is absolutely arbitrary, but technically only the auto-fixable errors cause CI fails for now.

@CyrusOfEden and I had a conversation a few weeks ago and said it was early in the project for enforcing the pre commit, but very open to creating a more realistic subset or gradually making fixes like the ones you sent