PlummersSoftwareLLC / Primes

Prime Number Projects in C#/C++/Python
https://plummerssoftwarellc.github.io/PrimeView/

Would anyone like to Admin this project? Someone knowledgeable in the ways of Github? #42

Closed davepl closed 3 years ago

davepl commented 3 years ago

If so, please chime in and volunteer! I wasn't really anticipating the amount of interest and I've got 30+ pull requests and I'm swamped just doing the benchmarking and making the videos, but if someone would like to take over this project and maintain the source to the Drag Racing series, I'd love that!

My goal is to be able to keep the original branch with only a very few changes. I have a few check-ins queued up of my own, which I will check in, and then leave the main branch alone. I want as few changes in the main branch as possible to keep the tests similar from one episode to the next, but we can party alongside it for sure!

But if folks want to add side-projects for ports to other languages (I know folks have already submitted D, Rust, Java, and others!) they can go in folders alongside, like PrimeJava.

If you're interested, please let me know! Ideally with a snippet or two about your experience with GitHub!

Thanks! Dave

JL102 commented 3 years ago

I'm not an expert at managing GitHub repos, so I wouldn't be that great for the job; but I do really hope that this repo continues to get activity. I would love for people to keep iterating and improving, and adding new languages, so that it can demonstrate the best that each language has to offer.

It's quite impressive how much performance can be gained by optimizing one's code - I'm extremely impressed by the one who got pure Python (without JIT compiling) to run several thousand iterations. I also, out of curiosity, tried out my own JS implementation with standard (dynamic-sized, loosely typed) arrays instead of a Uint8Array, and it was 43 times slower (152 iterations instead of 6800). Also, @rhbvkleef's note about getting rid of the getBit() and clearBit() methods and putting the logic inline led to a more than 2x speed bump in my code. In general, this is teaching me a lot about performance optimizations in code. I hope that people come in with more languages. I'm almost tempted to try a Brainfuck implementation, even though I've never written in Brainfuck. xD
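
To make the two optimizations above concrete, here is a minimal Python sketch. It is not the JS code described in this comment, only an analogue under stated assumptions: `bytearray` stands in for a fixed-size typed array, and the "get bit" and "clear bit" steps are written inline as an index check and a slice assignment instead of separate method calls. The function name `count_primes` is made up for illustration.

```python
def count_primes(limit: int) -> int:
    """Count the primes <= limit with an odd-only sieve of Eratosthenes."""
    if limit < 2:
        return 0
    # Fixed-size, byte-typed storage: index i stands for the odd value 2*i + 1.
    size = (limit + 1) // 2
    is_prime = bytearray([1]) * size
    is_prime[0] = 0  # the value 1 is not prime
    factor = 3
    while factor * factor <= limit:
        # Inline "get bit": a plain index lookup, no helper method.
        if is_prime[factor // 2]:
            # Inline "clear bit": strike the odd multiples of factor,
            # starting at factor squared, with one slice assignment.
            start = factor * factor // 2
            count = len(range(start, size, factor))
            is_prime[start::factor] = bytes(count)
        factor += 2
    # The odd-only array does not store 2, so add it back here.
    return sum(is_prime) + 1


if __name__ == "__main__":
    print(count_primes(1_000_000))
```

How much each change helps depends heavily on the language and runtime, so the numbers quoted in this comment should not be expected to carry over to Python.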

rhbvkleef commented 3 years ago

I'm swamped just doing the benchmarking and making the videos

I can imagine. I think it's really cool how much interest has appeared.

I think looking at reviewing and merging requests is a good idea. I have a few ideas on what can be done to streamline this process, but I am still quite unsure about some of them. The main problems I see are that it is desirable to allow multiple implementations per language, and that implementations need to be able to run on multiple platforms (maybe; or should Windows become a hard requirement?).

As such, I've made an initial draft of what I would do if I were allowed to maintain this project. I would be quite happy to implement this plan, and accept suggestions on it. As for my GitHub experience, I'm not sure I can demonstrate very much. I do manage the repository and CI on https://github.com/StichtingIAPC/SteamBird, but that is about the extent of my demonstrable experience with GitHub. I do also volunteer development for a Point-of-Sale system for a foundation, and was the technical lead of that project for 2 years, but this is on a private instance of GitLab. I also use GitLab extensively at work. Even though I have little demonstrable experience with GitHub, I do think I have enough skills and knowledge to take up this task.

Directory structure

I considered several options, and I think allowing each individual user to contribute their implementation is a desirable approach. These implementations are then ranked in different categories:

  1. stl-faithful: Only using the STL of the language, implemented faithful to the original implementation (using approximately the same class-structure/API or language-equivalent)
  2. stl-unfaithful: Only using the STL of the language, not faithful to the original implementation
  3. dependencies-faithful: Using additional dependencies of the language, implemented faithful to the original implementation (using approximately the same class-structure/API or language-equivalent)
  4. dependencies-unfaithful: Using additional dependencies of the language, not faithful to the original implementation
  5. native: Native implementation, available through FFI

Then, I think a directory structure like the following is desirable:

Once such a structure is established, we can use it to make rankings, and compare between different implementations and languages easier.
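
The directory listing itself did not survive in this copy of the thread, so the following Python sketch is only an illustration of how the five categories above could map onto paths. The nesting order (`language/category/user`), the function name `solution_path`, and the example user name are all assumptions, not the layout that was actually proposed.

```python
from enum import Enum
from pathlib import PurePosixPath


class Category(Enum):
    """The five categories listed in the proposal above."""
    STL_FAITHFUL = "stl-faithful"
    STL_UNFAITHFUL = "stl-unfaithful"
    DEPENDENCIES_FAITHFUL = "dependencies-faithful"
    DEPENDENCIES_UNFAITHFUL = "dependencies-unfaithful"
    NATIVE = "native"


def solution_path(language: str, category: Category, user: str) -> PurePosixPath:
    """Build a per-user solution path (hypothetical layout, see lead-in)."""
    return PurePosixPath(language) / category.value / user


if __name__ == "__main__":
    print(solution_path("python", Category.STL_FAITHFUL, "example-user"))
```

With a fixed scheme like this, a ranking script could group results by the first two path components to compare implementations within a language or across languages.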

Cooperating with an implementation

Any user will be allowed to copy another user's code to their own directory, and tune/modify it. Additionally, a user can submit a merge-request adjusting another user's implementation, but if that's the case, the original user is required to approve of the MR before the MR can be merged.

Licensing

The current repository does not have a specified license, and neither do any of the implementations currently open for merge. I don't think this is a problem, but I think it is still a good idea to add a license, to make sure that there will never be problems regarding it. I think any FOSS license is acceptable, but my personal suggestion is BSD-3.

Review process

Code submitted in a merge request MUST conform to the following requirements:

The reviewer should do the following actions:

The Drag Race

I think it would be incredibly fun to have regular races comparing every implementation. Because it is important that all the trials are run on identical hardware, I don't think it is trivial to automate this (as I'm not sure about the performance-consistency of GitHub Actions). Periodically someone should run a drag-race and tabulate the results. As I anticipate this will become a significant amount of work, this should be automated in some way.

Because I suspect contributors to this repository run on a healthy mix of Windows, Linux (and maybe Macs), this is not trivial, as the required compilers and runtimes are different on each platform, and also the shells are different (at least on Windows). I think it is advisable to simply overcome this problem on a language-by-language basis.

supercheetah commented 3 years ago

As a rando on the internet that hasn't updated his own repos in a few years, I nominate @rhbvkleef .

rcmaehl commented 3 years ago

While I only have 1-2 large repos I don't have any community contribution but @micwoj92 has been a big help and is in all sorts of repos.

flostellbrink commented 3 years ago

@rhbvkleef Sounds like a great plan!

I think #49 would help quite a bit with the review process. Of course not quite enough for the drag race, cause of changing hardware as you mentioned.

micwoj92 commented 3 years ago

While I only have 1-2 large repos I don't have any community contribution but @micwoj92 has been a big help and is in all sorts of repos.

Thanks for mentioning me. I was never admin to any GitHub repo so I don't think I qualify. I will just watch this repo and when there's something that I can help with I will. Also @rhbvkleef already has really good plan so I vote for him.

rhbvkleef commented 3 years ago

@rhbvkleef Sounds like a great plan!

I think #49 would help quite a bit with the review process. Of course not quite enough for the drag race, cause of changing hardware as you mentioned.

That's really useful! I agree that it might be quite useful.

Turnerj commented 3 years ago

A lot of good ideas @rhbvkleef . One thought I had with that directory structure though is that it could get noisy with very similar implementations just under another person's name. I think simplifying it down to the categories might be better.

* <language A>

  * original
  * stl-faithful
  * stl-unfaithful

* <language B>

  * dependencies-faithful

In this structure, it would mean any language could have at most 5 implementations (the original Dave wrote, stl-faithful, stl-unfaithful, dependencies-faithful, dependencies-unfaithful). If someone is wanting to make a particular language's implementation faster, they would more directly build upon the efforts of others rather than cloning the efforts. This also removes the case that the original person that did the change needs to approve the merge request - you would ideally just want someone who knows the language to verify it is in the right category.

Really, the only things that should stop a change from being merged are that it doesn't improve performance or that it doesn't fit in the specific category mentioned.

With this in mind (and with #49), the review process for changes should be drastically reduced. It would amount to checking the results of the CI build and clicking the merge button if the changes are valid for the category.


As for an admin for this project, I think you'll (@davepl ) probably want more than one. With the variety of languages that might end up being submitted, you'll probably want a wide cross section of people who work with those languages. Definitely think @rhbvkleef sounds like a good candidate though!

I'd be happy to help out - I've got a bunch of experience with GitHub through multiple repositories, though programming-language-wise I'd probably be of most help with C#.

davepl commented 3 years ago

I think we will need more than one long term, particularly with all of these languages. I propose we start with one, offering it to rhbvkleef who had the most solid plan, and start there! If anyone else has a strong urge and feels left out, let me know, I'm sure an extra hand or two will be welcome.

Now to see if I can figure out how to make him an admin!

micwoj92 commented 3 years ago

Click on Settings tab, then Manage Access and Invite a collaborator.

rhbvkleef commented 3 years ago

I have just received an email from Dave, inviting me to administrate this repository. I'm eager to get started 😄 . I've just accepted the admin invitation.

I think we will need more than one long term, particularly with all of these languages.

I agree. There are a lot of contributions to sift through, and decisions to be made, and I hope I can ask some people to share their knowledge to do so.

I propose we start with one, offering it to rhbvkleef who had the most solid plan, and start there!

Thank you :)

If anyone else has a strong urge and feels left out, let me know, I'm sure an extra hand or two will be welcome.

I do indeed think that a couple more hands would be very welcome.

As for implementing the plan I proposed, I think I will start either Tuesday or Wednesday. I am currently away from my computer quite a lot due to Easter.

ciplogic commented 3 years ago

I have been an admin on GitHub projects and have managed PRs in the past. Apart from not having a Mac, I can run all the other tests and make a graph once a week on Windows and Linux, on Zen 2 based hardware.

So if you feel like I can help, add me as a helper person with write access.

I have a small child, so I will not be helping on every turn...

dswij commented 3 years ago

A lot of good ideas @rhbvkleef . One thought I had with that directory structure though is that it could get noisy with very similar implementations just under another person's name. I think simplifying it down to the categories might be better.

* <language A>

  * original
  * stl-faithful
  * stl-unfaithful

* <language B>

  * dependencies-faithful

Just wanted to add that some categories might also have different implementations, so the disadvantage with this is that we might not be able to compare these.

@Turnerj has a good point, but I think what might be sensible is to limit each category to some arbitrary amount based on the speed rankings (top 5, top 10, etc.). If someone else wants to build upon the work of others, it is fine as long as it improves the speed (or credit the original author in some way).
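
As a rough illustration of the "top N per category" idea, here is a small Python sketch. The result format (category, author, passes), the function name `top_n_per_category`, and the sample data are all hypothetical; the thread does not define how results would be stored.

```python
import heapq
from collections import defaultdict


def top_n_per_category(results, n):
    """Keep the n entries with the most passes in each category.

    `results` is an iterable of (category, author, passes) tuples.
    Returns a dict mapping each category to its authors, fastest first.
    """
    by_category = defaultdict(list)
    for category, author, passes in results:
        by_category[category].append((passes, author))
    return {
        category: [author for _, author in heapq.nlargest(n, entries)]
        for category, entries in by_category.items()
    }


if __name__ == "__main__":
    sample = [
        ("stl-faithful", "a", 100),
        ("stl-faithful", "b", 300),
        ("stl-faithful", "c", 200),
        ("native", "d", 50),
    ]
    print(top_n_per_category(sample, 2))
```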

@rhbvkleef Sounds like a great plan!

I think #49 would help quite a bit with the review process. Of course not quite enough for the drag race, cause of changing hardware as you mentioned.

I think what can be done to improve #49 is to run the benchmark multiple times and take the average running time (or maybe something else), so we have more confidence based on the number of samples. (Also, maybe on different architectures there will be performance differences? but this might be a different discussion)
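
A minimal Python sketch of that idea, assuming the benchmark can be wrapped in a callable: time it several times, then report the mean alongside the median and standard deviation (one reading of "or maybe something else"). The helper names `time_runs` and `summarize` are invented for this example and are not part of #49.

```python
import statistics
import time


def time_runs(workload, runs=5):
    """Call workload() `runs` times; return each wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Reduce raw timings to a few summary statistics."""
    return {
        "runs": len(durations),
        "mean": statistics.mean(durations),
        "median": statistics.median(durations),
        # Sample standard deviation needs at least two data points.
        "stdev": statistics.stdev(durations) if len(durations) > 1 else 0.0,
    }


if __name__ == "__main__":
    print(summarize(time_runs(lambda: sum(range(100_000)), runs=5)))
```

The median is less sensitive than the mean to a single slow run, which may matter on shared CI machines.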

If there is any lack of hand, let me know and I will be happy to lend one :)

rhbvkleef commented 3 years ago

Just wanted to add that some categories might also have different implementations, so the disadvantage with this is that we might not be able to compare these.

I think this is a good point. I am already quite swamped with categorizing all MR's, and I think it will be an immense job to deal with merging everything, especially considering the code-style fixes. I think it would make my life a lot easier if we keep it to either one implementation, or one for each user.

I think what can be done to improve #49 is to run the benchmark multiple times and take the average running time (or maybe something else), so we have more confidence based on the number of samples.

I agree, that might give us a more reasonable bench. For the final drag-race, we will have to do it on a single machine. Where possible, we should define docker-containers to run the tests in, so that we can easily run them in CI and locally.

But, let's move any further discussions on this topic to #66.

rhbvkleef commented 3 years ago

My goal is to be able to keep the original branch with only a very few changes. I have a few check-ins queued up of my own, which I will check in, and then leave the main branch alone.

@davepl I see you've been merging in several implementations. What exactly is the plan w.r.t. this? I expected the plan to be that there would be the main branch with just your code, and then we would create a different branch (propose: drag-race) where contributions would be merged into. Are we now going to use the main branch anyway, or is this still the plan?

davepl commented 3 years ago

I think I've only touched the C and C_PAR folders, and have played with but not modified others.

I'm happy to plug away in a little branch of my own... a drag-race branch works for me. I've never actually forked a branch before so if you want to do so and then send me instructions on "touch this stuff, but don't break this branch here", I'll try to follow the rules as best I can!

LanceMcCarthy commented 3 years ago

We should also set up some stable custom runners so that the drag race result numbers are more reliable.

Personally, I'd be happy to help and host a custom Linux, macOS or Windows runner for this repo. (I have a bunch of experience with GitHub/ADO/GitLab CI/CD for .NET and Node builds and can help here.) We just need to make sure we do not have PR-triggered runs (from a security and stability standpoint).

@rhbvkleef has a good idea on having a protected drag-race branch that will trigger the builds. Only collaborators and owners can do the merge into that branch. (and prevent @davepl from breaking it :D )

My credentials and trustworthiness:

smiliea commented 3 years ago

Hey Dave!

I am not good with Git either. I am an old C programmer who started (and still uses) C++ Builder. I don't know how to push code to Git, but created my own fork and pushed my C++ Builder code to it! It may be interesting to compare the Embarcadero code to MS C++ vs. MinGW. I bet the compilers do a pretty good job optimizing for speed.

Thanks,

Andrew

rhbvkleef commented 3 years ago

I am getting swamped with the amount of work to get everything to work, and next to my own job, I'm not getting as much done as I wanted, and I'm struggling with making some decisions. As such, I want to get a second maintainer in. I have several nominations:

If anyone else is interested, please let me know. At this point, I expect I'll ask @LanceMcCarthy because he seems to be the most qualified, but again, if I'm wrong, I wouldn't mind being corrected. I will be making the choice on Saturday 16:00, UTC.

As for my plans right now: I plan on pushing forward with my original plan of keeping each user's contribution separate for now. The discussion on whether we should change this, and which implementation we should use, is not yet settled. Once discussions are enabled, I plan on finishing that discussion, and potentially merging contributions (as suggested by either @Turnerj or @dswij).

The current progress is:

The main thing that we currently need is:

So there's a lot to do.

marghidanu commented 3 years ago

@rhbvkleef I can also help

mike-barber commented 3 years ago

It looks like we've got quite a few new PRs arriving every day, and there's quite a bit of work to do. The curse of success!

One thing that might reduce the amount of work required is to update the README in main, and possibly add a very small CONTRIBUTING.md file there too.

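For new contributors arriving, there are no guidelines on where to target stuff, so the natural thing is to just copy the existing structure in main rather than being guided to the new structure in drag-race. Other things that could be useful in this are

  • 5s run (or 10s; but we should pick one)
  • include a README in your solution

    • with instructions on how to build/run it
    • state how close to the C++ solution it is, or
    • state how the solution is particularly idiomatic in your chosen language
    • note contributors
  • include a run.sh in your solution
  • include basic tests to ensure that the solution actually works (maybe)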

Doesn't have to be perfect right now, but I think it'll help to reduce the admin workload: PRs should be closer to correct coming in the front door.
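
Part of such guidelines could be checked automatically before a human looks at a PR. The Python sketch below only tests for the README and run.sh items suggested in this thread; the exact file name `README.md`, the constant `REQUIRED_FILES`, and the function `missing_files` are assumptions made for illustration.

```python
from pathlib import Path

# File names are assumptions; the thread only asks for "a README" and "a run.sh".
REQUIRED_FILES = ("README.md", "run.sh")


def missing_files(solution_dir):
    """Return the required file names that are absent from solution_dir."""
    root = Path(solution_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    import sys

    target = sys.argv[1] if len(sys.argv) > 1 else "."
    missing = missing_files(target)
    print("missing: " + ", ".join(missing) if missing else "all required files present")
```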

Happy to help where I can, too!

JL102 commented 3 years ago

One thing that might reduce the amount of work required is to update the README in main, and possibly add a very small CONTRIBUTING.md file there too.

For new contributors arriving, there are no guidelines on where to target stuff, so the natural thing is to just copy the existing structure in main rather than being guided to the new structure in drag-race. Other things that could be useful in this are

  • 5s run (or 10s; but we should pick one)
  • include a README in your solution

    • with instructions on how to build/run it
    • state how close to the C++ solution it is, or
    • state how the solution is particularly idiomatic in your chosen language
    • note contributors
  • include a run.sh in your solution
  • include basic tests to ensure that the solution actually works (maybe)

Doesn't have to be perfect right now, but I think it'll help to reduce the admin workload: PRs should be closer to correct coming in the front door.

I think that's a very good idea. Having guidelines would be quite helpful, so that it reduces the workload of admins/mods.

  • @JL102 you've been very helpful with the CI situation, and I've seen you out and about.

I think you may have mixed me up with @marghidanu - They're the one who talked about CI 🙂 But in any case, I can try to help where I can (though at the moment, finals are nearing, so I can't spend too much time).

rhbvkleef commented 3 years ago

One thing that might reduce the amount of work required is to update the README in main, and possibly add a very small CONTRIBUTING.md file there too. For new contributors arriving, there are no guidelines on where to target stuff, so the natural thing is to just copy the existing structure in main rather than being guided to the new structure in drag-race. Other things that could be useful in this are

  • 5s run (or 10s; but we should pick one)
  • include a README in your solution

    • with instructions on how to build/run it
    • state how close to the C++ solution it is, or
    • state how the solution is particularly idiomatic in your chosen language
    • note contributors
  • include a run.sh in your solution
  • include basic tests to ensure that the solution actually works (maybe)

Doesn't have to be perfect right now, but I think it'll help to reduce the admin workload: PRs should be closer to correct coming in the front door.

I think that's a very good idea. Having guidelines would be quite helpful, so that it reduces the workload of admins/mods.

I also agree. Some discussion was held in #80 for the solution-specific READMEs. The addition of a CONTRIBUTING and updating of the README in main is on the TODO list, but I should probably make that a priority, to make sure that new PRs arrive in a somewhat merge-ready state. Let's move this complete discussion to #80 .

  • @JL102 you've been very helpful with the CI situation, and I've seen you out and about.

I think you may have mixed me up with @marghidanu - They're the one who talked about CI 🙂 But in any case, I can try to help where I can (though at the moment, finals are nearing, so I can't spend too much time).

Ah, I'm not surprised that my notes got mixed up a bit. Sorry for that 😬

rhbvkleef commented 3 years ago

Having observed the repository the last couple of days, I think I need someone that can help me make decisions most urgently. As such, I am inviting @marghidanu to help me with finishing the ruleset, deciding on how to deal with different implementation strategies, and how to guide the community to make contributing easier and more streamlined. I hope he can be of help! @marghidanu would you like to have a chat with me (text or otherwise) about how to proceed?

marghidanu commented 3 years ago

@rhbvkleef found you! Ping me when you have time!

rbergen commented 3 years ago

Closing this, as we seem to have a stable group of maintainers.