amperser / proselint

A linter for prose.
http://proselint.com
BSD 3-Clause "New" or "Revised" License
4.31k stars 177 forks source link

modernization of proselint #1361

Closed orgua closed 1 month ago

orgua commented 6 months ago

I hereby make use of my copyright and revoke the permission to use any of my code (commits & branches). Good thing is that nothing has been merged yet. I'm licensing it under a private license that specifically forbids usage in proselint.

Nytelife26 commented 6 months ago

Thank you for your contribution. I will review this as soon as possible.

You are right to be concerned about lack of activity with the project. I also offered to maintain it a while ago, but my original point of contact, Dr. Suchow, seemingly vanished. Unfortunately, since I do not have administrative rights over proselint, work on the project without his input and approval is not sustainable.

I do not have the necessary permissions to maintain this project and its associated public facets entirely without reaching into my own wallet and risking future removal from it.

That being said, I like your enthusiasm. It would be useful to have someone to work with on this.

Update: I am unable to determine the problem right now, but I authorised CI for this pull request. Unit tests are failing. This seems to be an environmental issue where the system is failing to find pytest - I am unable to comment further until I review the PR, but this is usually related to poetry.

Update part 2: Having glanced over the changes, it would appear a lot of files (largely HTML and markdown) have phantom additions. Since I am on mobile right now I cannot inspect these properly, but I would be inclined to undo those for now to make it easier to view the actual substance of the PR.

orgua commented 6 months ago

Wow, thanks for the quick answer and helpful insights. I also tried to contact the original authors, but never got any answer. Emails, Accounts and Websites seem dead. Maybe I should look for some more contact-options.

I try to answer your concerns:

If you see fit for the main-branch I'm happy to work on all these points.

Nytelife26 commented 5 months ago

Wow, thanks for the quick answer and helpful insights. I also tried to contact the original authors, but never got any answer. Emails, Accounts and Websites seem dead. Maybe I should look for some more contact-options.

I haven't been able to find any so far. I originally communicated with Dr. Suchow via email and then Slack, but he eventually stopped responding to both. I am not sure how to proceed with the fact that PyPI requires 2FA now - I have only briefly seen mention of it, and I'm not sure if changes will be required to publication methods to proceed. In the worst case, this project will need a new package.

  • review will be hard on this 7000+ LOC Change, but i removed no functionality and only extended unittests. Some things had to be renamed - as old naming seems odd.

That's fine, I can work through it. It just might take a while, which I hope you can understand.

  • ghostchanges in unrelated files: I got tired of all the linting-errors, so i enabled the fixer. it's only correcting file-endings (missing newline) and stray spaces at line-endings.

That makes sense. I might merge those separately to make this pull request easier to manage and sift through.

  • CI, GH-actions & plugins: i put (almost) no time in them for now. i can fix those - but depending on how we proceed I would choose a different approach.

I handled updating all of the CI for this repository originally. I can fix it here too. Although, out of curiosity, what approach are you suggesting?

  • poetry is great in some ways, but on this project i had only trouble with it. for packaging the builtin tooling of python suffice. and as a dev-env i still prefer pipenv, but there are helpers now for poetry in actions
  • linting & other checks were done with a batch-script instead of a clean & maintainable pre-commit-config - i recreated the large parts of the script in modern tooling but didn't update the actions yet

As for these, I am not entirely familiar with the evolution of the Python ecosystem. Tooling seems to be all over the place, so your suggestions in this matter are appreciated. I'll take a look at the pre-commit tooling, as I'm familiar with them from working on Node projects.

Thank you again for this contribution.

Nytelife26 commented 5 months ago

Having read over some of the code, I have another concern. I could be proven wrong later on, but would memoizing lint not return the same errors regardless of a configuration difference from the last run on the same text?

orgua commented 5 months ago

Having read over some of the code, I have another concern. I could be proven wrong later on, but would memoizing lint not return the same errors regardless of a configuration difference from the last run on the same text?

It will not, as the config or rather the active checks are taken into account. Yes, the file has to be rechecked completely when changing the config, BUT this approach removes a substantial ammount of overhead. I assumed config-changes are far less frequent than file-changes - specially when proselint is used as an IDE-linter.

I will review / incorporate your comments now and answer more in detail later.

orgua commented 5 months ago

I haven't been able to find any so far. I originally communicated with Dr. Suchow via email and then Slack, but he eventually stopped responding to both. I am not sure how to proceed with the fact that PyPI requires 2FA now - I have only briefly seen mention of it, and I'm not sure if changes will be required to publication methods to proceed. In the worst case, this project will need a new package.

I found a newer email, but got no reply either. ps: he is Prof. now and according to his paper-output he is alive. more on the package-handling down below.

I handled updating all of the CI for this repository originally. I can fix it here too. Although, out of curiosity, what approach are you suggesting?

my latest dev-branch has pre-commit tests for linting. I would add this to the GH-Actions, together with a pytest-run to check incoming PR.

As for these, I am not entirely familiar with the evolution of the Python ecosystem. Tooling seems to be all over the place, so your suggestions in this matter are appreciated. I'll take a look at the pre-commit tooling, as I'm familiar with them from working on Node projects.

I would ditch poetry and use the python-internals. they are less trouble and work great in actions. For devs I add a pipenv-file to maintain the clean containment feature.

so how do we proceed with the package? Just a proposal: ATM i would opt to switch main-development to a repository with full control. This would enable us to activate branch-protection (only change main with a reviewed PR), GH-hosted documentation and much more. Forks are still able to merge PR from the main-project. I would also add a proselint2 package in pypi with you as a second full-admin - or do you have control over the pypi release? If one of the original creator decides to react the code can always move back to /amperser in the meantime it's probably a good idea to leave a redirect in the readme.

Heads up: I continued work in a dev-branch and added:

Nytelife26 commented 5 months ago

I am currently writing from my phone, so I will reply to the rest of your comments in further depth tomorrow, but I would like to say a few things for now.

To begin, thank you for your efforts to restore contact with Professor Suchow. While I am comfortable to work on this project without his supervision, he possesses far greater knowledge of this field than I do, and is the only person I know of that might be able to provide us with the required permissions to move forward.

Unfortunately, due to GitHub actions secrets I do not have access to, like the PyPI publishing information and the website deployment key, a fork is not an option. I will look into trying to contact some of his associates soon.

orgua commented 5 months ago

Update: I got mail. Contact restored. Looks like we will have a meeting in the near future :)

Nytelife26 commented 5 months ago

Update: I got mail. Contact restored. Looks like we will have a meeting in the near future :)

That's excellent. Would you mind involving me in this?

orgua commented 5 months ago

Sorry, yes of course. I meant "we" as in "we three". Makes no sense for me to do this without you.

Nytelife26 commented 5 months ago

Sorry, yes of course. I meant "we" as in "we three". Makes no sense for me to do this without you.

That's excellent news, thank you. You can reach me at russellt4@coventry.ac.uk if you have updates regarding this matter.

orgua commented 5 months ago

That Email bounced back - Relaying denied. Please check

Nytelife26 commented 5 months ago

That Email bounced back - Relaying denied. Please check

That's unusual. I received it, and replied to you. Let me know if you got it.

Nytelife26 commented 5 months ago

Is there any chance you can keep this PR up to date with your development branch? It's somewhat difficult and unwieldy to see your progress on our conversations when I have to compare between the pull request and the branch in your repository.

orgua commented 5 months ago

Is there any chance you can keep this PR up to date with your development branch? It's somewhat difficult and unwieldy to see your progress on our conversations when I have to compare between the pull request and the branch in your repository.

I can, but also intentionally decided to branch off to not disturb the review process. Also the current PR contains most of the groundwork for further changes without changing too much on the outer interfaces. It's somehow tested and "complete". My thought was to maybe review this PR, implement fixes in /dev and when current /dev is mature also review that one separately and merge all together. As current /dev touches less files it should be easier to handle.

I am open for ideas. Bigger changes to the outer interface are unavoidable for some features.

If you really insist on me merging current /dev I would at least fix one open failing test in the unittests and also update the Action-script. Won't have time before the weekend though.

Nytelife26 commented 5 months ago

If you really insist on me merging current /dev I would at least fix one open failing test in the unittests and also update the Action-script. Won't have time before the weekend though.

That's fine. I can get the tests working again at least. Enjoy the remainder of your week.

Nytelife26 commented 5 months ago

I am currently rebasing the branch, so avoid making any new pushes until further notice. You may also need to rebase your development branch as a consequence, unfortunately.

orgua commented 5 months ago

this will be very messy as most of this is already fixed or changed substantially in the PR.

Nytelife26 commented 5 months ago

Correct, but I would've needed to rebase it eventually anyway.

Also, I noticed something in the test suite. Why does it now throw SystemExit instead of FileNotFoundError when an invalid file is given?

orgua commented 5 months ago

Also, I noticed something in the test suite. Why does it now throw SystemExit instead of FileNotFoundError when an invalid file is given?

Click is handling that internally. Exception not needed, just warn and exit with 1.

Nytelife26 commented 5 months ago

Click is handling that internally. Exception not needed, just warn and exit with 1.

Gotcha. I'll remove those assertions, then. Although, for some reason, the clear cache test is now failing in your PR.

tests/test_clear_cache.py:75:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <MagicMock name='remove' id='140343936190480'>, args = ('./a.pyc',), kwargs = {}, expected = "remove('./a.pyc')"
actual = 'not called.', error_message = "expected call not found.\nExpected: remove('./a.pyc')\nActual: not called."

    def assert_called_with(self, /, *args, **kwargs):
        """assert that the last call was made with the specified arguments.

        Raises an AssertionError if the args and keyword args passed in are
        different to the last call to the mock."""
        if self.call_args is None:
            expected = self._format_mock_call_signature(args, kwargs)
            actual = 'not called.'
            error_message = ('expected call not found.\nExpected: %s\nActual: %s'
                    % (expected, actual))
>           raise AssertionError(error_message)
E           AssertionError: expected call not found.
E           Expected: remove('./a.pyc')
E           Actual: not called.

/usr/lib/python3.11/unittest/mock.py:924: AssertionError

I'm not really sure what to do about this.

orgua commented 5 months ago

I'm not really sure what to do about this.

Thats clearly from the old caching-system. That shouldn't be in there - and it isn't. Must be a uncommitted fragment on your side. Can't test right now, but I think the tests ran through.

Nytelife26 commented 5 months ago

Must be a uncommitted fragment on your side. Can't test right now, but I think the tests ran through.

That's fair. I'm about 35/117 through the rebase. This is certainly the largest PR I have ever dealt with, which is exciting.

orgua commented 5 months ago

I still don't understand why the rebase is so massive. there are only 2 new commits. Isn't that very prone to errors?

Nytelife26 commented 5 months ago

I have managed to rebase everything successfully, and get pytest to run. A new problem is that TypeAlias is not available on Python versions prior to 3.10. I am not sure how you would like to proceed with that.

orgua commented 5 months ago

I have managed to rebase everything successfully, and get pytest to run. A new problem is that TypeAlias is not available on Python versions prior to 3.10. I am not sure how you would like to proceed with that.

Already replaced one of the typeAlias with a named tuple, will also extend the second as it's cleaner.

That latest force-push really messes up a lot - I had to do a rebase on my current main (this PR) and hoped /dev would be smoother, but its a hot mess. I have to handle 80+ commits indiviually, deleted files are reappearing randomly. Update: I stopped the rebase and did a merge commit. The PR seems to work - still WIP. Is it ok for you to review it there? I would propose to: 1) bring the /dev-addition to a working state 2) update proselint/main next as handling this gets more and more complicated 3) after that we can talk about the next steps. A bunch of issues can be closed. Do you wanna do a release inbetween or change more core-structures and merge in PRs?

Nytelife26 commented 5 months ago

Already replaced one of the typeAlias with a named tuple, will also extend the second as it's cleaner.

That's good news, thank you.

That latest force-push really messes up a lot - I had to do a rebase on my current main (this PR) and hoped /dev would be smoother, but its a hot mess. I have to handle 80+ commits indiviually, deleted files are reappearing randomly.

I know it's not ideal, and I apologise sincerely for that. Thank you for sticking with me and continuing to work on it.

Update: I stopped the rebase and did a merge commit. The PR seems to work - still WIP. Is it ok for you to review it there?

Of course. I'll take a look shortly.

  1. update proselint/main next as handling this gets more and more complicated

To make this easier, it might be possible to split this into multiple more atomic PRs. I am not sure how you would like to group them, but that would be my suggestion at this stage. I am of course okay with proceeding to review these in whatever structure you like, that's just a comment from my experience.

  1. after that we can talk about the next steps. A bunch of issues can be closed. Do you wanna do a release inbetween or change more core-structures and merge in PRs?

I would like to at least make progress on the existing PRs around refactoring the configuration system and such, but it might be better to release most of these changes first. Having too many breaking changes in one release could prove hard for people to keep up with. The userbase is, to my knowledge, already dwindling from its former state, so driving more people out would not be ideal.

orgua commented 5 months ago

Allright. Don't know what PR-splitting brings, but OK. You are right - config needs a major overhaul, same for the check-structure. I already opened two issues to collect ideas and coordinate the effort to put it on a future-proof foundation.

reference:

Nytelife26 commented 5 months ago

I am working on the check categories. I have some basic ideas for now, but more is to follow later.

Nytelife26 commented 4 months ago

As a follow-up, you were right. Splitting this PR up would be a nightmare. It would be better to proceed with bringing in your dev branch changes, tidy up the commits here, and then merge this before we work on the other proposals.

Nytelife26 commented 1 month ago

note for myself: it would be useful to consider the discussion in #1369 as part of this, or maybe as a later change.