stdlib-js / google-summer-of-code

Google Summer of Code resources.
https://github.com/stdlib-js/stdlib

[RFC]: develop C implementations for base special mathematical functions #41

Closed gunjjoshi closed 2 months ago

gunjjoshi commented 3 months ago

Full name

Gunj Joshi

University status

Yes

University name

Indian Institute of Information Technology, Kottayam, Kerala

University program

Bachelor of Technology in Computer Science and Engineering

Expected graduation

May 01, 2026

Short biography

I am currently a second-year undergraduate and will be entering my third year by the time the coding period starts. I am pursuing a B.Tech degree in Computer Science and Engineering. My technical competencies include C/C++, JavaScript, Python, AWS and several other web technologies, along with a good amount of knowledge and experience with Docker and Firebase.

My coursework includes a wide variety of subjects covering Computer Organization, Computer Networks, Data Structures, Database Management Systems, Operating Systems, Design and Analysis of Algorithms, and so on.

Some of my general interests include contributing to open-source projects, learning new technologies and developing backends for web applications and other software.

Timezone

Indian Standard Time (IST), UTC+05:30

Contact details

email: gunjjoshi8372@gmail.com, joshi22bcs24@iiitkottayam.ac.in

Platform

Mac

Editor

My preferred code editor is Visual Studio Code. In my experience so far, it has been the smoothest and most user-friendly editor, and I particularly like its merge conflict resolver and its large selection of extensions.

Programming experience

Apart from my contributions to stdlib, my programming experience spans personal and group projects, open-source contributions, and competitive programming, covering technologies such as Next.js, Node.js, React.js, Tailwind CSS, C/C++, Python, and JavaScript. Some of the projects I've enjoyed working on include the following:

JavaScript experience

My most recent experience with JavaScript comes from my contributions to stdlib, such as adding new packages, adding C implementations for existing mathematical functions, and refactoring some of the blas/ext/base packages to align with current project conventions. Alongside my JavaScript work at stdlib, I've also used JavaScript extensively during my coursework and have completed numerous projects using JavaScript and its frameworks.

One of the best aspects of JavaScript is its rich ecosystem of frameworks. It can be used for creating awesome libraries like stdlib, great frontends using React.js, robust backends using Node.js, and much more!

However, one downside of JavaScript, in my opinion, is that most people do not take it as seriously, or show as much interest in it, outside of web development. This perception is gradually changing, especially with projects like stdlib proving that mathematical calculations and visualizations can be done efficiently with JavaScript.

Node.js experience

While working with various web applications and other projects, I have gained extensive experience with Node.js. Not only have I worked with its core functionalities, such as writing server-side code and handling database connections, but I have also tackled tasks like file handling and more.

C/Fortran experience

The first programming language I learned was C, during high school, where I created small command-line projects. Building on this foundation, I further developed my skills during my undergraduate coursework and have since made significant contributions to stdlib, including adding C implementations for various mathematical functions originally written in JavaScript. Additionally, I have a basic understanding of Fortran.

Interest in stdlib

Ever since I got to know about stdlib, I have been excited to see how things work under the hood. The whole idea of building a mathematical library for JavaScript seemed great. Initially, I thought,

But why do we need this?

As I went deeper into the codebase and talked with the maintainers, I got my answer:

What if we want to use those complex mathematical functions in the browser, without relying on Python libraries?

That was when I developed a huge interest in stdlib. The core idea of having all the numpy and scipy functionality directly in JavaScript, in our browsers, amazed me. The organization of the entire project impressed me greatly. Even contributions made in remote corners of the project propagate to other repositories automatically. The approach of using JavaScript implementations for smaller, lighter functions and C implementations for larger, more complex ones struck me as ingenious.

The mentors are very helpful, and so are the fellow contributors. This, in turn, makes a great community. The whole process of learning, discussing and implementing felt great to someone like me.

Version control

Yes. I have a great amount of experience with GitHub.

Contributions to stdlib

Merged Pull Requests: https://github.com/stdlib-js/stdlib/pulls?q=is%3Apr+author%3Agunjjoshi+is%3Amerged+
Open Pull Requests: https://github.com/stdlib-js/stdlib/pulls/gunjjoshi
Issues: https://github.com/stdlib-js/stdlib/issues?q=is%3Aissue+author%3Agunjjoshi+

Goals

After successfully completing this project, C implementations for math/base/special functions, in both single precision and double precision, will have been added. Some further packages, including those which involve complex numbers, will also be added for both single precision and double precision. This will represent a significant step towards achieving parity with numpy and scipy. Furthermore, existing implementations whose upstream implementations have received bug fixes will be updated.

In addition to these enhancements, work on automation and scaffolding will be undertaken to automate specialized package generation. This will streamline the process of extending the C implementations of math/base/special packages to support strided arrays and ndarrays.

Why this project?

What excites me about this project is that we're building something at a large scale and for a large audience. Having a JavaScript library with numpy and scipy parity is not a small thing! While working on certain projects, I have personally wished for something like numpy or scipy that would let us use those complex mathematical functions directly in a web application, or in the browser, rather than relying on Python libraries.

More specifically, the project "develop C implementations for base special mathematical functions" excites me immensely because these C implementations are essential for optimizing the performance of our existing JavaScript functions. I believe this initiative has the potential to be a game-changer.

Along with that, the idea of working with the very supportive and insightful stdlib community, both mentors and fellow contributors, is what excites me the most.

Qualifications

During the course of this project, I will be utilizing C, JavaScript, and Node.js. As mentioned earlier, I have acquired sufficient knowledge and hands-on experience with these technologies. Throughout my high school and undergraduate coursework, I have consistently delved deeper into the workings of C and JavaScript.

Additionally, I have developed several web applications with Node.js, which have provided me with a solid understanding of the platform.

My contributions to stdlib have further added to my experience with C, JavaScript, and Node.js. As a result of all these factors, I believe I possess the technical know-how required for this project.

Prior art

For this project, some of the work has already been started. The C implementation for some of the math/base/special functions has already been done, as given here. This progress provides us with a foundation to build upon rather than starting entirely from scratch. The ideation of the automation and scaffolding part has also been started here, and we can soon have a final plan and implementation for that too.

Commitment

This, in turn, sums up to 340 hours in total.

Apart from this, I would aim to get started with the work in the bonding period itself, to ensure that we are always a bit ahead of the intended schedule. I don't have any other commitments for the entire duration of the program, which would enable me to dedicate my full time to this project.

Implementation Plan

Schedule

Assuming a 12 week schedule,

Moreover, I would be opening a Pull Request after implementing each package, so that we avoid the burden of reviewing a large chunk of code at the last moment. I fully understand that reviewing those PRs may take a significant amount of time. While they are under review, I will be working on #1147 in parallel.

In total, I will be aiming for the default 12-week timeline, during which I would be working on this 350-hour project. If we are left with some more time towards the end, we can work in more depth on strided arrays and ndarrays.

Notes:

Checklist

gunjjoshi commented 3 months ago

Any suggestions on this? cc: @kgryte @Planeshifter @Pranavchiku

Pranavchiku commented 3 months ago

Hey @gunjjoshi, after going over your proposal, here are a few things that I feel can enhance it:

Incorporate the changes and we can iterate over it again, thank you!

gunjjoshi commented 3 months ago

Thanks for reviewing my proposal @Pranavchiku. I will incorporate the suggested changes.

gunjjoshi commented 3 months ago

I have updated this based on your suggestions, @Pranavchiku, and have restructured my timeline accordingly. You can have a look now!

kgryte commented 3 months ago

Thanks, @gunjjoshi, for sharing your draft proposal. A couple of comments:

  1. For reference implementations, we are not limited to Cephes and FreeBSD. We also draw from Golang, Boost, Julia, FDLIBM, and Slatec. In short, we mix and match based on how we evaluate ease of implementation and general accuracy/performance. For a few of our implementations, we'd likely want to swap out Cephes for something else (e.g., FreeBSD or OpenLibm), as Cephes implementations may be a bit dated at this point.
  2. You may want to spend some time mapping out the dependency graph, so that you know what order double-precision implementations need to be implemented in. E.g., rempio2 is a prerequisite for sin, cos, etc. And as each implementation will be a PR, you'll want to ensure that you plan work out such that you minimize being blocked. E.g., while rempio2 is being reviewed, maybe you're able to continue working on other implementations which are not dependent on rempio2.
  3. Some APIs will have design considerations that need to be addressed. For example, some of the gamma functions are currently variadic, but should be refactored to be non-variadic, as we won't be supporting variadic interfaces in C. This will mean also updating call sites throughout the project accordingly.
  4. For complex number base math functions, I suggest consulting C99: https://en.cppreference.com/w/c/numeric/complex. We'll want to ensure we have full coverage for those APIs.
  5. Similarly, if we are missing functions from IEEE 754, those would be good to add, as well: http://www.dsc.ufcg.edu.br/~cnum/modulos/Modulo2/IEEE754_2008.pdf. One, in particular, that comes to mind is remainder.
  6. If you are blocked, you can always work on stdlib #1147 in parallel.
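
The planning advice in point 2 can be sketched as a small helper that, given a dependency map, reports which packages can be started while others are still in review. This is only an illustration under stated assumptions: the `unblocked` function and the `deps` map are hypothetical, and only the rempio2-before-sin/cos relationship comes from the comment above; the `abs` entry is a placeholder.

```javascript
'use strict';

// Hypothetical dependency map: package name -> names of packages it depends on.
// Only the rempio2 -> sin/cos relationship is taken from the discussion;
// the `abs` entry is an illustrative placeholder.
var deps = {
	'rempio2': [],
	'sin': [ 'rempio2' ],
	'cos': [ 'rempio2' ],
	'abs': []
};

/**
* Returns the packages which can be started now: packages which are neither
* merged nor in review and whose dependencies have all been merged.
*
* @param {Object} deps - map from package name to an array of dependency names
* @param {Array<string>} merged - packages whose PRs have been merged
* @param {Array<string>} inReview - packages whose PRs are awaiting review
* @returns {Array<string>} package names, sorted alphabetically
*/
function unblocked( deps, merged, inReview ) {
	var names = Object.keys( deps ).sort();
	var out = [];
	var name;
	var ok;
	var i;
	for ( i = 0; i < names.length; i++ ) {
		name = names[ i ];
		if ( merged.indexOf( name ) >= 0 || inReview.indexOf( name ) >= 0 ) {
			continue;
		}
		// A package is ready only once every dependency has been merged:
		ok = deps[ name ].every( function isMerged( d ) {
			return merged.indexOf( d ) >= 0;
		});
		if ( ok ) {
			out.push( name );
		}
	}
	return out;
}

// While rempio2 is in review, only packages independent of it are available:
console.log( unblocked( deps, [], [ 'rempio2' ] ) );

// Once rempio2 is merged, sin and cos become available as well:
console.log( unblocked( deps, [ 'rempio2' ], [] ) );
```

Treating "in review" separately from "merged" reflects the point that a dependent implementation cannot land until its prerequisite does, while unrelated work can continue.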

gunjjoshi commented 3 months ago

Thanks for the suggestions @kgryte. I will be making a dependency graph for the packages that we currently have in math/base/special, and will incorporate it in the proposal. Along with that, I will modify the proposal as per your remarks.

gunjjoshi commented 3 months ago

@kgryte, some packages, such as math/base/special/cceilf, have incomplete C implementations. They do have include and src directories, but no native.js. We will be completing their implementations too, right? Or is this intentional?

gunjjoshi commented 3 months ago

I've added the following package dependency graph.

stdlib-package-dependencies.drawio-3.pdf

This graph contains all packages in math/base/special that currently do not have C implementations.

Are there any changes that I should make now? Also, for the contributions section, which would be better in your view: listing all the contributions one by one, or just linking to them, as we do now?

cc: @kgryte

Pranavchiku commented 3 months ago

Just to make things easier for yourself, you may sort the graph in topological order. This will give you a list showing the order in which to proceed with implementing the C functions.

Also, you may incorporate math/base/assert/* APIs.

kgryte commented 3 months ago

@gunjjoshi That graph is nice. Thanks for generating. And agreed with @Pranavchiku on performing a topological sort. You can potentially leverage @stdlib/utils/compact-adjacency-matrix for this, which provides a toposort method. We actually also have tooling for this: https://github.com/stdlib-js/stdlib/tree/develop/lib/node_modules/%40stdlib/_tools/pkgs/toposort.
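
To illustrate what a topological sort yields here, the following is a minimal, self-contained sketch using Kahn's algorithm. It is not the implementation behind the stdlib tooling linked above; the `toposort` function and the example dependency map are hypothetical, and the sketch assumes every dependency also appears as a key in the map.

```javascript
'use strict';

/**
* Orders packages so that every package appears after its dependencies
* (Kahn's algorithm).
*
* @param {Object} deps - map from package name to an array of dependency names
* @throws {Error} if a dependency is not a key in the map
* @throws {Error} if the graph contains a cycle
* @returns {Array<string>} ordered package names
*/
function toposort( deps ) {
	var names = Object.keys( deps ).sort();
	var dependents = {};
	var indegree = {};
	var queue = [];
	var order = [];
	var name;
	var list;
	var i;
	var j;

	for ( i = 0; i < names.length; i++ ) {
		indegree[ names[ i ] ] = 0;
		dependents[ names[ i ] ] = [];
	}
	// Count unmet dependencies and record the reverse edges:
	for ( i = 0; i < names.length; i++ ) {
		list = deps[ names[ i ] ];
		for ( j = 0; j < list.length; j++ ) {
			if ( !Object.prototype.hasOwnProperty.call( deps, list[ j ] ) ) {
				throw new Error( 'unknown dependency: ' + list[ j ] );
			}
			indegree[ names[ i ] ] += 1;
			dependents[ list[ j ] ].push( names[ i ] );
		}
	}
	// Start with packages which have no dependencies:
	for ( i = 0; i < names.length; i++ ) {
		if ( indegree[ names[ i ] ] === 0 ) {
			queue.push( names[ i ] );
		}
	}
	while ( queue.length > 0 ) {
		name = queue.shift();
		order.push( name );
		list = dependents[ name ];
		for ( j = 0; j < list.length; j++ ) {
			indegree[ list[ j ] ] -= 1;
			if ( indegree[ list[ j ] ] === 0 ) {
				queue.push( list[ j ] );
			}
		}
	}
	if ( order.length !== names.length ) {
		throw new Error( 'dependency cycle detected' );
	}
	return order;
}

// Hypothetical example based on the rempio2 -> sin/cos relationship:
var order = toposort({
	'sin': [ 'rempio2' ],
	'cos': [ 'rempio2' ],
	'rempio2': []
});
console.log( order ); // => [ 'rempio2', 'cos', 'sin' ]
```

Reading the result from left to right gives one valid implementation order: prerequisites always come before the packages which need them.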

kgryte commented 3 months ago

You should be able to use the pattern option to sort only the packages in math/base.

gunjjoshi commented 3 months ago

Thanks for the topological sort idea, @Pranavchiku @kgryte. I tried working with https://github.com/stdlib-js/stdlib/tree/develop/lib/node_modules/%40stdlib/_tools/pkgs/toposort, but couldn't figure out how to run it. I tried running it from the command line, but couldn't work out the exact commands. I also thought of using it from npm, but couldn't find @stdlib/_tools there either. There is an option for command-line interface usage, but do we have any example uses or references showing how to use it?

kgryte commented 3 months ago

From the root stdlib directory,

node ./lib/node_modules/@stdlib/_tools/pkgs/toposort/bin/cli $PWD/lib/node_modules/@stdlib/math/base

gunjjoshi commented 3 months ago

That actually worked. Thanks @kgryte. Based on the output, should I also list in the proposal, in order, the packages I would be working on? That would make it a bit lengthier, but would add some clarity. Or would it be better to use the output while actually implementing them, and include just the dependency graph in the proposal?

kgryte commented 3 months ago

You can use it to help guide your timeline in terms of which higher priority APIs need to be tackled first before moving on to others. In your final proposal, I think it is enough to simply reference the above command and state that you'll also be using it as a guide for what order you'll need to implement single-precision APIs.

Pranavchiku commented 3 months ago

In your free time, you may work on developing a script to add the boilerplate files required in a package. Also, since benchmarks and tests are mostly copied over from existing files, a script will speed you up. Try exploring along these directions.

gunjjoshi commented 3 months ago

Good idea, @Pranavchiku. I will explore developing this script too, by referring to some of the scripts that we currently use, such as evalpoly, evalrational, etc. These are not at the level we would require, but I believe they can serve as initial references that we can then build on. I'll modify the proposal accordingly.

Pranavchiku commented 3 months ago

You need not make last-minute changes to your proposal; it is fine if you don't have it. Just keep working on these things in parallel to boost your progress.

kgryte commented 3 months ago

I think Pranav is referring to your own local development workflow. For example, when I work on new packages, I typically copy-and-paste an existing package which is similar to what I want (e.g., has C benchmarks, is a native add-on, etc), and then modify accordingly. Other folks may have different workflows. E.g., using a command from the terminal which copies over specific files but then scaffolds others. The general gist is that you can spend some time figuring out your preferred approach to authoring packages.
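
As one possible shape for such a workflow, here is a minimal sketch of deriving a new package from an existing one by renaming. Everything in it is hypothetical: the `scaffold` function, the in-memory `template` object, and the `foo`/`bar` names are illustrative only, and a real script would read from and write to the filesystem rather than operate on plain objects.

```javascript
'use strict';

/**
* Derives the files for a new package from an existing "template" package by
* replacing the template's name in both file paths and file contents.
*
* @param {Object} files - map from relative file path to file contents
* @param {string} from - name of the existing package
* @param {string} to - name of the new package
* @returns {Object} map from new file path to new file contents
*/
function scaffold( files, from, to ) {
	var paths = Object.keys( files );
	var out = {};
	var path;
	var i;
	for ( i = 0; i < paths.length; i++ ) {
		path = paths[ i ];
		// `split`/`join` replaces every occurrence without regular expression escaping:
		out[ path.split( from ).join( to ) ] = files[ path ].split( from ).join( to );
	}
	return out;
}

// Hypothetical in-memory template; a real package contains many more files:
var template = {
	'src/foo.c': 'double stdlib_base_foo( const double x );',
	'README.md': '# foo'
};

var pkg = scaffold( template, 'foo', 'bar' );
console.log( Object.keys( pkg ) ); // => [ 'src/bar.c', 'README.md' ]
```

A plain textual replacement also rewrites the name wherever it occurs inside longer words, so the output of a script like this would still need a manual pass before opening a PR.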

gunjjoshi commented 3 months ago

Got it, @kgryte @Pranavchiku. I will figure out what works best for me. As there will be a large number of packages, spending some time on this will definitely be fruitful.

Also, I've submitted my final proposal. Thanks a lot for the suggestions and for helping out!