mdmintz / pynose

pynose fixes nose to extend unittest and make testing easier
https://pypi.org/project/pynose/
GNU Lesser General Public License v2.1

Clearing up license confusion (post-mortem) #36

Open mdmintz opened 1 month ago

mdmintz commented 1 month ago

For those of you who missed the action, a large number of people recently showed up in a flash mob to discuss and/or complain about license issues in pynose (and some of my other repos). They mainly came from Reddit and 4Chan. (Source: "GitHub Insights")

Screenshot 2024-07-11 at 10 17 14 PM

Some big questions: Were the claims justified? Was I unfairly targeted? Are there other popular repos with the same issues? Let's go through the points that were brought up and see based on some helpful questions:

(Question) Can a repo fork/copy from another repo while removing history? (The part in question: https://github.com/mdmintz/pynose/commit/5b7314a0a7c1ad61a98fb506dca54a095aa9ad30, where pynose was created from a modified version of nose.)

(Answer) As it turns out, yes, that's legal: That's how Microsoft created Playwright from Google's Puppeteer: See https://github.com/microsoft/playwright/commit/9ba375c06344835d783fe60bf33f857f9bc208a4, which was made from a modified copy of Puppeteer (https://github.com/puppeteer/puppeteer).

(Question) Can a repo change its license to MIT from something else?

(Discussion) The Puppeteer License is Apache: https://github.com/puppeteer/puppeteer/blob/main/LICENSE. However, when Microsoft created Playwright, they changed the original license to MIT: https://github.com/microsoft/playwright/commit/794b59c0b547f6f7a43b457fbff03b126ee124ce. Certainly looks legal if Microsoft can do it. Turns out that maybe it wasn't OK because they later changed it back: https://github.com/microsoft/playwright/commit/562e6f5fe19e40dab57574eda39d2049ffdf8a28. So even with code reviews and a very large legal team to double-check things, even big companies can get licensing wrong sometimes. If that's the case, then certainly smaller teams (or even individual repo maintainers) may get licensing wrong, or not know correct licensing from wrong licensing if the repos they're learning from didn't get it right either. I ended up "pulling a Microsoft" by setting a license to MIT from a non-MIT license. Got it fixed though: https://github.com/mdmintz/pynose/pull/30. Also fixed a secondary license issue: https://github.com/mdmintz/pynose/pull/34. In the process of that secondary fix, I learned that there was another repo (not mine) that also had a license issue: https://github.com/pdbpp/pdbpp. After pointing it out, someone opened a ticket for it.

(More Discussion) As it turns out, licensing issues are quite common: https://github.blog/2015-03-09-open-source-license-usage-on-github-com/ (It may be an old article, but it says only 20% of repos have a license (30% for forked ones) and that "Open source simply isn’t open source without a proper license.") So although there was a licensing issue with pynose (now fixed), there was a disproportionate response focused at my GitHub. Lots of people disrespected not only me, but also one of the original nose maintainers who came to help. (They downvoted him because he thanked me for resurrecting nose. If you look through the other comments on the thread, anyone who said positive things about me got downvoted.)

Screenshot 2024-07-12 at 8 32 08 AM

For some context (as not everyone here knows) nose is (or once was) a very popular Python unit-testing framework that hasn't been maintained in over 8 years:

Screenshot 2024-07-12 at 8 41 25 AM

Major companies around the world still depend on it. Unfortunately, nose stopped working when Python 3.10 came out. Although it was easy to patch it at that point, the number of things that broke with nose increased at a rapid rate with the releases of Python 3.11 and Python 3.12. People either didn't want to fix it, or didn't know how to fix it. Although I'm quite busy with a lot of other things, I decided to fix it because I knew how to do it. (I've been using Python ever since working at ITA Software, which was acquired by Google.) So I took on that "burden" and created pynose. Major companies that were still dependent on nose began using it. Those companies include big names like Mozilla, Intel, DocuSign, Wikimedia, and SAP:

Screenshot 2024-07-12 at 9 32 45 AM Screenshot 2024-07-12 at 9 09 06 AM Screenshot 2024-07-12 at 9 12 06 AM Screenshot 2024-07-12 at 9 10 58 AM

Some of my fixes for nose were shipped with Alpine Linux (meaning that they would be found on Azure, Google Cloud, AWS, and Docker instances around the world). E.g.: https://github.com/alpinelinux/aports/blob/5fb0b96b79977fd89ee20f1d2bd3367762df67a1/community/py3-nose/python-nose-py312.patch

With people finding out about the popularity of pynose, they came by and then called in others from 4Chan, Reddit, Mastodon, etc. While some people did offer constructive criticism, there were many others that either came by to just rant, or to wave torches & pitchforks. The ones with the torches & pitchforks mostly came from https://boards.4chan.org/g/thread/101339536. Some of the people there made comments that were way out-of-line and very offensive. (You can read their long thread and make your own assessments.) There were lots of extremely hostile messages on 4Chan, and calls for people to come downvote my pynose posts.

Some tickets opened in pynose were more helpful than others. Eg. This was helpful: https://github.com/mdmintz/pynose/issues/33 (Clear points about licensing rules so that the problems could be described in detail, and fixed accordingly.) This earlier one was not as helpful: https://github.com/mdmintz/pynose/issues/28 (Fewer details and the mention of preserving history, which as I mentioned earlier with the Microsoft example here: https://github.com/microsoft/playwright/commit/9ba375c06344835d783fe60bf33f857f9bc208a4 shows that preserving the Git commit history of the original repo is not necessary.) Eventually, I sorted out necessary changes from non-necessary demands by using Microsoft's Playwright repo as a case study. Both pynose and playwright made similar decisions / mistakes, as posted earlier. (The mistakes that needed to be corrected have already been fixed.)

(Question) Was pynose not giving credit to nose?

(Answer) The ReadMe clearly stated at the top that "pynose is an updated version of nose, originally made by Jason Pellerin." Credit was definitely acknowledged and given. (But for some, the ReadMe didn't count because they only cared about the license.)

Screenshot 2024-07-11 at 10 06 31 PM

One of the three official maintainers of nose spoke positively about pynose fixing nose and keeping it alive:

Screenshot 2024-07-12 at 10 39 56 AM

For reference, here are the three official nose maintainers according to PyPI:

Screenshot 2024-07-09 at 9 16 43 AM

Let's get back to the "Questions":

(Question) Can a license be slightly modified from the original to include new maintainers for a forked / copied project?

(Answer) Yes, Microsoft added their name when they modified Google's code: https://github.com/microsoft/playwright/commit/9ba375c06344835d783fe60bf33f857f9bc208a4#diff-0a2cb6528fb78d67f03776f9e443ba3b811ecb8cab767af904e48604197c922b

Screenshot 2024-07-12 at 12 50 11 AM

If that's legal, then I can also add my name when modifying code. (Context: https://github.com/mdmintz/tabcompleter/pull/11, where someone was trying to tell me that I can't do that after returning the original license.)

(Question) Can I put a license for a specific file directly in the file itself, rather than including it in the main LICENSE file?

(Answer) Yes, Microsoft did it: https://github.com/microsoft/playwright/commit/9ba375c06344835d783fe60bf33f857f9bc208a4#diff-647cd6d72ffd0e5a5e9ba4f459fb9d36bb7b9aa621723e0eb7b221e1d9bc67bcR2 - Copyright 2017 Google Inc., PhantomJS Authors All rights reserved. in the file itself. - The main licenses did not include any mention of PhantomJS. (Source: https://github.com/microsoft/playwright/blob/71a668eb863ca44e269f8353bfb055d7e0d4e583/LICENSE. It also wasn't in their ThirdPartyNotices.txt file: https://github.com/microsoft/playwright/blob/71a668eb863ca44e269f8353bfb055d7e0d4e583/packages/playwright/ThirdPartyNotices.txt)

Someone came after one of my repos without knowing that putting specific licenses directly into files was OK:

Screenshot 2024-07-12 at 1 19 59 AM

The files were copied directly from their CDN links, which meant that the license would be there if it wasn't missing in the CDN. Here's an example of that:

Screenshot 2024-07-12 at 1 32 49 AM

Therefore, the license would only be missing there if the CDN link didn't include it. (Maybe a CDN issue if the license wasn't uploaded with the JS or CSS code from there.) The JS and CSS file copies would be from there, as well as any SeleniumBase Chrome extension zip files included directly in the repo. Here's another example of the license in the file: https://github.com/seleniumbase/resource-files/blob/main/js/hopscotch/hopscotch.min.js. I deleted a few of his invalid tickets for that (for him not realizing that the license can be included within the files themselves). That is why you might not find the ticket that I copied from the email notification posted above. For fairness' sake, I didn't delete other tickets of his when there were valid points, e.g.: https://github.com/mdmintz/tabcompleter/issues/10. (He did complain later on social media that I deleted a few of his tickets.)

On the topic of SeleniumBase, although the https://github.com/mdmintz org falls under my responsibility, my https://github.com/seleniumbase org falls under the special protection of the Software Freedom Conservancy (due to being part of the Selenium umbrella of frameworks). This means that if anyone has a license issue or any legal issue with a repo in the SeleniumBase org, then they need to go through the Software Freedom Conservancy instead of going directly through me. For regular SeleniumBase issues (non-licensing stuff) you can go directly through me (opening a regular ticket). For any possible license issues that you may have with SeleniumBase, go directly to the Software Freedom Conservancy: https://sfconservancy.org/news/2011/feb/02/selenium-joins/ As written there: By joining the Conservancy, Selenium obtains the benefits of a formal non-profit organizational structure while keeping the project focused on software development and documentation. Some benefits of joining the Conservancy include the ability to collect donations, hold assets on behalf of the project, and some protection of the lead developers of the project from personal liability when engaging in the activities of the project. So specifically for SeleniumBase, they have my back.

So in summary, open source license rules can get very complicated: Even big corporations can make mistakes. If a big company does something incorrect with respect to licensing, it's easy for individual developers learning from those repos to make the same mistakes without realizing it. Sometimes, even the people coming to complain about a license issue may get some things wrong (Eg. Them thinking that history from a forked/copied repo needs to be preserved, which clearly isn't the case because this happened: https://github.com/microsoft/playwright/commit/9ba375c06344835d783fe60bf33f857f9bc208a4, where Google's Puppeteer Git History was removed during the creation of Microsoft's Playwright repo.) Also, some people are more helpful than others in resolving things (by providing useful, actionable feedback). Then there are others out there who are just trying to mess with other people's reputations. The GitHub ecosystem should be a welcoming space for all developers.

For anyone skipping right to the end of this long message, all outstanding requests have been resolved, people are happy with the results, and pynose will continue to be shipped with Linux distributions around the world.

Screenshot 2024-07-12 at 12 14 59 PM

And now people know me a bit better. In particular, they know I'm the guy who fixes unmaintained Python packages that businesses still depend on. Eg. pynose, as well as others like pdbp (not to be confused with pdb or pdbpp). And they know I'm the guy who does a lot with web automation (SeleniumBase). With all the work I do, one would think that I don't get much of a chance to go outside, but I did manage to attend ballroom dance class a few evenings this week, and I recently went to a Star Trek convention where I survived for a whole three days without opening my laptop (https://www.youtube.com/watch?v=BwHc4lIS5z8). There, I partied on the set of the original Enterprise with Jonathan Frakes, and I had a fun conversation with LeVar Burton.

OK, back to work, everyone! There's lots of Python code to write!

https://github.com/mdmintz

pbrkr commented 1 month ago

I didn't "come after one of your repos", I raised an issue that I saw.

I'm sorry that this turned into a pile-on. I'm deeply unhappy with how aggressive the comments on the GitHub issues got, and I need to give a post-mortem of my own at some point, as there are some lessons for me to learn. In particular, I need to be more careful about venting steam on social media when there is already enough attention on an issue. But please don't assume everyone there was acting in bad faith.

To pick up on the specific issue in https://github.com/seleniumbase/resource-files, I do know that "putting specific licenses directly into files was OK". Many of the files contain license info, but not all. You may want to review introjs, that is under the AGPL (according to https://github.com/usablica/intro.js/blob/master/license.md) and this will have implications for any downstream users which they should be made aware of.

And in regards to "my https://github.com/seleniumbase org falls under the special protection of the Software Freedom Conservancy (due to being part of the Selenium umbrella of frameworks)", if SeleniumBase is part of the Selenium project, it should be listed on https://www.selenium.dev/projects/. You may want to check with SFC as to precisely what services they provide to member projects - they provide advice and can undertake license enforcement activities, but I expect that member projects still need to respond to license compliance issues raised by third parties.

pbrkr commented 1 month ago

Also, for the record: I knew nothing of the 4chan thread. That is fucking abhorrent and I'm sorry that you were subject to that.

mdmintz commented 1 month ago

@pbrkr SeleniumBase is listed in the Selenium ecosystem on this page: https://www.selenium.dev/ecosystem/ (They told us we get the benefit of the special protection.)

As for the files in https://github.com/seleniumbase/resource-files and a few other areas, those are downloaded directly from https://www.jsdelivr.com/ (or most of them at least) and saved in case of an emergency situation where the CDN loses its files or goes down completely. The files aren't directly used at all. The repo doesn't really need to exist. When SeleniumBase loads a special resource on a webpage, it grabs the data from the CDN directly. If the License data is missing from the CDN, then it would probably be a problem there first. I haven't updated https://github.com/seleniumbase/resource-files in a few years, due to not being used at all, and probably won't need to be. Could easily be deleted if its existence proved to be a problem.

Kangie commented 1 month ago

Can a repo fork/copy from another repo while removing history?

The actual answer is "it depends", and whether or not it's legal doesn't make it the morally correct thing to do.

I still urge you to use the recipe I provided to restore the original commit history, it'll make life easier in the long run when you want to track down when / why a decision was made.

Can a repo change its license to MIT from something else?

Yes, but you can't change from something more restrictive to something more permissive. It depends on the terms.

Your Microsoft example is explicitly permitted under the terms of the Apache license, whereas LGPL is strong copyleft and does not permit that.

Major companies around the world still depend on it.

And you put these companies at significant legal risk if they were using your forked code in breach of the original license.

pbrkr commented 1 month ago

As for the files in https://github.com/seleniumbase/resource-files and a few other areas, those are downloaded directly from https://www.jsdelivr.com/ (or most of them at least) and saved in case of an emergency situation where the CDN loses its files or goes down completely.

Unfortunately, a lot of files are published on CDNs in a non-compliant way, lacking copyright notices and license conditions. That's a frustration, but not one that you or I can fix completely.

What I would advise is to keep a record, maybe in a readme file, of the upstream projects, versions and links to licenses for each of the subdirectories in https://github.com/seleniumbase/resource-files and the zip files in https://github.com/seleniumbase/SeleniumBase/tree/master/seleniumbase/extensions. Knowing which versions you're distributing also really helps downstream users in case they need to do any security or license compliance of their own.
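One way to keep such a record is to generate the table from a short list of entries. The following is only a sketch under stated assumptions: the `VendoredAsset` class, its field names, and the `render_manifest` helper are hypothetical, the intro.js path is a guess, and every value marked `TODO` is a placeholder that would have to be checked against the upstream project before publishing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VendoredAsset:
    """One third-party bundle kept in the repo (hypothetical record shape)."""
    path: str         # subdirectory or zip file inside the repo
    upstream: str     # upstream project URL
    version: str      # exact version that was copied
    license_url: str  # link to the upstream license text


def render_manifest(assets):
    """Return a plain-text table, one row per asset, sorted by path."""
    header = ("Path", "Upstream", "Version", "License")
    rows = [header] + [
        (a.path, a.upstream, a.version, a.license_url)
        for a in sorted(assets, key=lambda a: a.path)
    ]
    # Pad every column to the width of its longest cell.
    widths = [max(len(row[i]) for row in rows) for i in range(len(header))]
    lines = [
        "  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in rows
    ]
    return "\n".join(lines)


# Placeholder rows. "js/hopscotch" appears earlier in this thread; the
# intro.js path is a guess, and every TODO must be filled in by hand.
EXAMPLE_ASSETS = [
    VendoredAsset(
        "js/introjs",
        "https://github.com/usablica/intro.js",
        "TODO",
        "https://github.com/usablica/intro.js/blob/master/license.md",
    ),
    VendoredAsset("js/hopscotch", "TODO", "TODO", "TODO"),
]

if __name__ == "__main__":
    print(render_manifest(EXAMPLE_ASSETS))
```

The output could be pasted into a readme next to the files, and a `TODO` left in any row is a visible reminder that the version or license of that bundle has not been confirmed yet.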

I still disagree with "For any possible license issues that you may have with SeleniumBase, go directly to the Software Freedom Conservancy" - I consider a licensing issue to be just another bug that needs fixing. I think that project maintainers should handle these issues in the first case, and in your case you may request assistance from SFC under any agreement you have with them if you need it.

alerque commented 1 month ago

The case study you present is in no way an equivalent scenario. The original license in question there was Apache and you have no idea what IP deals go on between those companies. The LGPL is a viral license with copy-left protections that the Apache does not provide. Even ten minutes of reading about different license types should have tipped you off to these types of differences existing. Likewise the jQuery example you tried to give earlier isn't equivalent because of CLAs that were involved.

Which brings us to the difference between copyrights and licenses, which you conflate throughout this post. They are not the same thing, and different licenses and jurisdictions handle copyright differently. For projects without a CLA, contributors may (or may not depending on the license and jurisdiction) maintain copyright on their contributions even after contributing them to the project. With GPL related licenses for example, each contributor becomes part of the copyright holder (hence why you need signoff from all contributors to relicense, not just project maintainers).

Likewise, "big corp X did it" is not an argument or rationale. Big corporations use their muscle and lawyer teams to wrangle their way through illegal moves all the time. If you want to regain any modicum of respect in the FOSS world you should ditch that line of thinking and look around at how FOSS folks see this issue. For example maintaining commit history in a fork is not a legal requirement and nobody said it was. It is strongly preferred and not doing so is a huge red flag culturally. Continuing to justify this by saying it isn't illegal is not going to earn you any respect from people who spend their time contributing to FOSS projects this century.

The ironic thing is I haven't even seen any real motivation for why you wanted to relicense in the first place. Why not just admit it was an ill-advised and poorly researched move that shouldn't have been done? You're not the victim here (only going on GitHub where I've been, not 4chan or reddit or whatever else); it wasn't a mob's fault things went south, it was largely the somewhat outrageous things you said when the issue was brought to your attention that scrapped your reputation in the eyes of the FOSS community. Even with the original license restored I'm sure a lot of distro maintainers and projects are going to continue looking somewhat askance at anything affiliated with you. This long-winded attempt to justify and rationalize isn't doing you any favors.

kevingranade commented 1 month ago

If you are working with ~~SFLC~~ SFC I strongly recommend you correspond with them about all of this instead of just namedropping them and saying it's not your problem. It doesn't matter if they "have your back", you're still ultimately on the hook for your behavior, and things like deleting issues bringing up valid licensing concerns in your repositories is a very bad look.

Overall you haven't cleared anything up here except that you are doubling down on sloppily rationalizing your bad actions.

SlowBrainDude commented 1 month ago

Congrats dude! You just won the "destroy your reputation forever in one simple online interaction" prize. :+1:

Instead of admitting wrongdoing and honestly asking to be forgiven after, cough, committing a criminal offense (license violations are no small issue, people have been sued for billions on those grounds; just try to distribute a Hollywood movie without a license and see what happens next) you're still trying to make a victim out of yourself (even though you're the offender!) while finger-pointing at other people and trying to hide behind some whataboutism.

On top of that you're still showing a complete lack of understanding and quite some fundamental ignorance about how the basics of legally binding software licenses work. So you provably learned nothing from this episode here. That's really gross!

Given such vexatious behavior and the complete lack of understanding of the issue one needs to assume that this was not the only offense from your side. All your code shall now be examined. And to avoid further lengthy discussions it would be best if all violations be submitted directly to the lawyers at gpl-violations.org. Some people obviously will only learn the hard way they should not steal from their fellows, and OpenSource licenses are chosen for a reason.

thesamesam commented 1 month ago

(Note that SFLC isn't the same as SFC, but the point I agree with nonetheless.)

duxsco commented 1 month ago

@mdmintz Closing and deleting comments that much here and in https://github.com/mdmintz/pynose/issues/16 may be seen by some as an abuse of your moderator rights. You can always use GitHub for arbitration:

While we are passionate about empowering maintainers to moderate their own projects, please contact us through the GitHub Support portal if you need additional support in dealing with a situation.

Source: https://docs.github.com/en/site-policy/github-terms/github-community-guidelines

mdmintz commented 1 month ago

The GitHub Community Guidelines are stated clearly:

Screenshot 2024-07-16 at 3 47 36 PM

It's up to a moderator's best discretion to decide when it's appropriate to remove comments. If I remove a comment, then it's because I considered that there was sufficient reason to do so.

Sometimes it's because there wasn't a good option when hiding a comment instead:

Screenshot 2024-07-16 at 7 59 43 AM

There's a game called "Two Truths and a Lie", where people say two things that are true, and then one thing that isn't. Then people have to guess which one is the lie. When people apply this game to the real world, they are able to trick people into thinking that the lie is true because the truths were true. It can be quite dangerous if people don't carefully evaluate each point that was made... often times they'll read through quickly and assume that the entire message is true just because the first few things they read were true. This makes it tricky to handle some comments where I can't selectively mark one area as valid and another area as invalid. (Here's a made-up example that simplifies a similar accusation: "The MIT License should have been an LGPL License, therefore this makes you a plagiarist.") In the opening message of this thread, I showed that credit was given, and I also linked to the original repo at the top of the README. I'll reiterate a bit better here:

As many of you have already seen, pynose goes above and beyond acceptable common practices of giving attribution for derived works, as seen from the beginning, where the attribution was at the top of the README.md and __init__.py files:

https://github.com/mdmintz/pynose/commit/5b7314a0a7c1ad61a98fb506dca54a095aa9ad30#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5R7

Screenshot

https://github.com/mdmintz/pynose/commit/5b7314a0a7c1ad61a98fb506dca54a095aa9ad30#diff-3fa844e028504aec4be871ebeaffa51082d8b40567cff328b609c374c6d8c44fR9

Screenshot

That's already more than the practice of putting attribution in a NOTICE file (instead of the README), where Microsoft mentioned that Playwright is derived from Puppeteer (one of Google's projects):

https://github.com/microsoft/playwright/blob/3694c1422d9a541776602fb870d0b8e249eff35d/NOTICE

Screenshot 2024-07-16 at 8 01 23 PM

Since people are more likely to see the top of a README file than what's in a NOTICE file, this should clarify that pynose not only gives credit where deserved, but also makes sure that people see the credit.

Also, I'm fairly certain that the majority of people using pynose found out about it because they were looking for a GitHub repo that fixed the unmaintained nose repo. If anyone knows of any pynose users who weren't familiar with the original nose repo, then please let me know. I'd be very curious to learn why they picked pynose over pytest if they never used the original nose in the first place.


Now, onto a more serious matter...

As GitHub Insights had shown, multiple users came over from 4Chan (numbers have grown since the original screenshot posted earlier). There, they made incredibly offensive comments: https://boards.4chan.org/g/thread/101339536, and it gets worse the more you scroll down. To summarize that thread, there were people there who went after me because I'm Jewish. There was also a call-to-action on 4Chan for people to come downvote my posts on pynose, which they did in large numbers. With the help of a photo I posted of me in Jerusalem: https://github.com/mdmintz/pynose/issues/16#issuecomment-2219308445, I was able to identify some of the "Anonymous" 4Chan users based on who downvoted the photo. That led to other places such as Mastodon, where some of the same people making negative comments on pynose also made other negative comments about my "people", (in similar negative fashion as the 4Chan users). Calling out individual accounts probably won't help, so I won't at this time.


"Stay frosty", Python users!

https://github.com/mdmintz

duxsco commented 1 month ago

I didn't question your right to moderate. If you don't want to have GitHub as an intermediary, so be it.

I see the fork button as a double-edged sword.

On the one hand, I like the network graph and the list of forks. You can sort the list based on stars, updates, open issues and open forks. This helps with finding potential successors of an orphaned repo, while the reference to upstream in your README.md comes short in this regard. As already stated by others, it's up to you whether to make use of the fork button.

On the other hand, I don't like the implications that come with the deletion of public repositories. Hopefully, people take note of such a change. I don't know whether such an action is explicitly pointed out by GitHub.

The photo itself just confused me. Thus, I clicked on the confused emoji provided by GitHub: (image)

In general, I like to keep politics and religion out of coding. Bringing it up doesn't help.

mdmintz commented 1 month ago

@duxsco I brought up the photo, https://github.com/mdmintz/pynose/issues/16#issuecomment-2219308445, only after I saw on social media / chat forums that many people were coming after me specifically because of my religion (Judaism) and because I support Israel. All those downvotes against me (for that specific photo) helped clarify the situation to bystanders. The post-mortem also helped clear up some confusion.

GabrielRavier commented 1 month ago

I don't think downvotes against that comment were necessarily caused by people being antisemites - it was probably a factor to those that came from places like 4chan, but I think many others simply saw the post as being randomly off-topic and might have interpreted it as, say, a really poor attempt to distract from the issue at hand. I would personally guess that the number of downvotes would not have been significantly lower if the flags in the image had been of e.g. Palestine.

nedbat commented 1 month ago

@mdmintz let's not take the 4chan bait and get distracted onto off-topic political issues. 4chan is entirely trolls, and I'm sorry you suffered abuse by them, but we don't have to follow their lead.

Instead, can you explain why you created the repo the way you did? A number of people have asked about this, and you haven't given an answer. Instead of the easy path (click the Fork button), you took a more difficult path that involved copying a subset of files into a new repo. This lost git history and made it more difficult to contribute your changes back in the future. Why would you do that? More baffling to me is why you dropped the test suite in the process. Shouldn't this library have a test suite?

This is an unusual way to make a fork, and looks suspicious. Perhaps you have good reasons, but you haven't explained.

I know you have said that you are not legally required to keep the git history. That is true. But it's really odd to do it the way you did. Help us understand.

mdmintz commented 1 month ago

@nedbat Excellent questions. As you know, I'm in the web automation space, and so I spend a lot of time studying web automation frameworks and repositories. One of the more recent repos I studied was Microsoft's Playwright. I was aware of how history was reset when Playwright was created from Google's Puppeteer. Seemed like a nice way to keep a framework lightweight for the next generation (and if history could be reset like that, then why would there be a double-standard on individual developers doing the same thing?). That is probably how I started with pynose - by getting nose from https://pypi.org/project/nose/1.3.7/#files, which wouldn't have the Git history included.

With the repo, (and using Python 3.11), I ran:

flake8 --statistics --count

This led to 1263 issues, summarized with the following categories by count:

4     E101 indentation contains mixed spaces and tabs
182   E111 indentation is not a multiple of 4
22    E114 indentation is not a multiple of 4 (comment)
1     E116 unexpected indentation (comment)
9     E122 continuation line missing indentation or outdented
1     E124 closing bracket does not match visual indentation
1     E125 continuation line with same indent as next logical line
5     E127 continuation line over-indented for visual indent
10    E128 continuation line under-indented for visual indent
5     E129 visually indented line with same indent as next logical line
13    E201 whitespace after '['
12    E202 whitespace before ']'
2     E221 multiple spaces before operator
1     E222 multiple spaces after operator
21    E225 missing whitespace around operator
15    E231 missing whitespace after ','
64    E251 unexpected spaces around keyword / parameter equals
36    E261 at least two spaces before inline comment
1     E262 inline comment should start with '# '
17    E265 block comment should start with '# '
1     E275 missing whitespace after keyword
28    E301 expected 1 blank line, found 0
203   E302 expected 2 blank lines, found 1
17    E303 too many blank lines (3)
40    E305 expected 2 blank lines after class or function definition, found 1
45    E306 expected 1 blank line before a nested definition, found 0
1     E401 multiple imports on one line
77    E501 line too long (80 > 79 characters)
3     E502 the backslash is redundant between brackets
2     E701 multiple statements on one line (colon)
3     E711 comparison to None should be 'if cond is None:'
3     E712 comparison to True should be 'if cond is True:' or 'if cond:'
8     E713 test for membership should be 'not in'
3     E721 do not compare types, for exact checks use `is` / `is not`, for instance checks use `isinstance()`
19    E722 do not use bare 'except'
6     E741 ambiguous variable name 'l'
85    E999 SyntaxError: invalid syntax
47    F401 'sys' imported but unused
5     F403 'from mypackage.math.basic import *' used; unable to detect undefined names
6     F633 use of >> is invalid with print function
4     F811 redefinition of unused 'debug' from line 24
13    F821 undefined name 'cmp'
13    F841 local variable 'a' is assigned to but never used
4     W191 indentation contains tabs
35    W291 trailing whitespace
1     W292 no newline at end of file
128   W293 blank line contains whitespace
21    W391 blank line at end of file
20    W605 invalid escape sequence '\d'
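For anyone who wants to double-check a tally like the one above, here is a minimal sketch that parses `flake8 --statistics` output and sums it per code and per category letter. It assumes each statistics line has the form "count, code, message" as shown above; the function names `tally_flake8_statistics` and `by_prefix` are made up for illustration and are not part of flake8 or pynose.

```python
import re
from collections import Counter

# One line of `flake8 --statistics` output: "<count> <code> <message>".
# The bare total printed by --count has no code, so it does not match.
STAT_LINE = re.compile(r"^\s*(\d+)\s+([A-Z]\d+)\s+(.*)$")


def tally_flake8_statistics(report: str) -> Counter:
    """Return a Counter mapping each flake8 code to its reported count."""
    counts = Counter()
    for line in report.splitlines():
        match = STAT_LINE.match(line)
        if match:
            counts[match.group(2)] += int(match.group(1))
    return counts


def by_prefix(counts: Counter) -> Counter:
    """Group counts by the first letter of the code (E, F, W, ...)."""
    grouped = Counter()
    for code, count in counts.items():
        grouped[code[0]] += count
    return grouped


# A few lines copied from the summary above (hypothetical subset).
sample = """\
4     E101 indentation contains mixed spaces and tabs
85    E999 SyntaxError: invalid syntax
13    F821 undefined name 'cmp'
20    W605 invalid escape sequence '\\d'
"""
counts = tally_flake8_statistics(sample)
print(sum(counts.values()))     # 122
print(dict(by_prefix(counts)))  # {'E': 89, 'F': 13, 'W': 20}
```

Feeding the full summary through the same function should reproduce the 1263 total, which is a quick sanity check that no category was dropped when copying the list.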

And so began the very slow process of fixing each line, one by one, on my own. Unfortunately, the tests that came with the repo were not in good shape. The quick flake8 scan showed 251 issues for the unit_tests folder and 321 issues for the functional_tests folder. A deeper look revealed that these tests were designed mainly with Python 2.7 in mind. (I wanted to make sure that 3.6 would be the minimum Python version, which today is Python 3.7.) With a total of 572 flake8 issues for the test folders, and tests that were themselves poorly designed, hard to read, inoperable, and inefficient, the easiest way to do testing would be to throw out all the tests that came with nose and replace them with the SeleniumBase test suite. As not everyone may realize, seleniumbase has a lot more dependencies than nose, so the unit testing for nose would have to be done as part of SeleniumBase testing. Some people may be surprised to hear it, but this worked out really well: nose testing via SeleniumBase was able to identify all known bugs so that they could be fixed.

And of course, there was a lot of major refactoring to fix the remaining 650+ flake8 issues that could be found throughout the code. Very tedious, but eventually I fixed them all! It was quite the mess getting everything organized, and some of it was probably done in the early hours of the morning (sort of like this response I'm typing right now). Therefore, I probably won't remember all the details, as pynose had its first release over a year ago.
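To give a rough idea of what sits behind codes like F821 (undefined name 'cmp'), F633 (`>>` with print), and W605 (invalid escape sequence) in the list above, here is a small sketch of typical Python 2 to Python 3 fixes. This is a generic illustration, not code taken from the pynose commits; the `cmp` helper and `compare_lengths` are hypothetical names.

```python
import re
import sys
from functools import cmp_to_key


def cmp(a, b):
    """Python 3 stand-in for the Python 2 built-in cmp(), which was removed."""
    return (a > b) - (a < b)


# Python 2: print >> sys.stderr, "message"   (F633 under Python 3)
print("message", file=sys.stderr)

# Python 2 often wrote '\d+' in a normal string (W605); a raw string is safe.
digits = re.compile(r"\d+")
print(digits.findall("nose 1.3.7"))  # ['1', '3', '7']


# Python 2: sorted(names, cmp=compare_lengths); the cmp= argument is gone,
# so an old-style comparison function is wrapped with cmp_to_key instead.
def compare_lengths(a, b):
    return cmp(len(a), len(b))


names = ["unittest", "nose", "pytest"]
print(sorted(names, key=cmp_to_key(compare_lengths)))
# ['nose', 'pytest', 'unittest']
```

Each fix is mechanical on its own; the tedious part is that many of them only surface after the E999 syntax errors in the same file are resolved, since flake8 cannot analyze a file it fails to parse.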

Keep in mind, pynose is one of several open-source projects that I've worked on. The largest of my open-source projects is seleniumbase (this should come as no surprise). And as you may recall, seleniumbase was first presented to a large public audience at Boston Python in February 2016. SeleniumBase has a lot of dependencies, including pynose. When the original nose stopped working, I wanted to keep the legacy alive. I looked around to see if anyone had already fixed it, but I couldn't find what I was looking for. I found nose2, but that wasn't backwards/forwards compatible with nose. I did find a lot of people who were looking for a fixed nose, though. Seemed like people either didn't have time to fix it, didn't want to fix it, or didn't know how to fix it. Then came me: I knew how to fix it, and I already spend a lot of time working on open source, so naturally I had to be the one to step in, fix all the flake8 issues, fix the bugs, make the seleniumbase tests work, and do everything else that was needed to get these fixes out to people.

OK, getting late here and I need to get some sleep before work in the morning. Hopefully that answered your questions!

duxsco commented 1 month ago

I didn't know the 4chan people had crossed over before reading the post-mortem. IMHO, you played into their hands by posting the photo and added to the confusion the 4chan people were already apparently causing for bystanders like me.

> With the help of a photo I posted of me in Jerusalem: https://github.com/mdmintz/pynose/issues/16#issuecomment-2219308445, I was able to identify some of the "Anonymous" 4Chan users based on who downvoted the photo.

I see such an attempt to identify haters as an abuse of the comment functionality. It would have been better to get GitHub involved instead.

If it had been kitten pictures, for example, I would have downvoted it straight away. But as it seemed to be about politics and religion, I opted for the confused emoji instead.

nedbat commented 1 month ago

> Seemed like a nice way to keep a framework lightweight for the next generation

It doesn't keep the framework lightweight, since the delivered code is the same. It just makes it hard to understand the history, and hard to contribute back.

> (and if history could be reset like that, then why would there be a double-standard on individual developers doing the same thing?).

Playwright was meant to be a new thing unrelated to the old Puppeteer. You had a different goal: you've said you wanted to contribute your fixes back to nose. That's really difficult now because of the choice you made, so it will likely never happen.

Thanks for keeping nose working, but I think it's yours forever now.

mdmintz commented 1 month ago

@nedbat, as you mentioned earlier: PSA: nose is unmaintained. Use pytest or plain unittest instead.


That, I can agree with. Newcomers who want to write unit tests for Python should probably look into using pytest (https://github.com/pytest-dev/pytest). It's also the primary test-runner for my SeleniumBase framework, which is one of the most popular users of pytest according to GitHub Topics for the "pytest" category: https://github.com/topics/pytest.

Because of this, I didn't have to take on the challenge of reviving nose as pynose (think of Gandalf the Grey vs Gandalf the White from Lord of the Rings), but I revived it anyway because many companies still depended on it. I gave them pynose, and anyone can use it. I'm just the maintainer.
