stdlib-js / google-summer-of-code

Google Summer of Code resources.
https://github.com/stdlib-js/stdlib

[RFC]: add BLAS bindings and implementations for linear algebra #62

Closed AuenKr closed 2 months ago

AuenKr commented 3 months ago

Full name

Golden Kumar

University status

Yes

University name

Maulana Azad National Institute of Technology Bhopal

University program

Bachelor of Technology in Electrical Engineering

Expected graduation

July 2025

Short biography

I am Golden Kumar, a pre-final-year undergraduate pursuing a B.Tech in Electrical Engineering at Maulana Azad National Institute of Technology, Bhopal, India.

I am interested in open-source contributions, building web applications end-to-end, and problem-solving.

Timezone

Indian Standard Time (IST), GMT+5:30

Contact details

Email : auenkumar64@gmail.com ; 211113224@stu.manit.ac.in, Phone Number: +91 7002979846, Github: https://github.com/AuenKr, Twitter (X): https://twitter.com/auenkr

Platform

Linux

Editor

My primary operating system is Ubuntu 22.04, and my preferred code editor is Visual Studio Code (VS Code), which is more beginner-friendly than command-line editors such as Vim and Neovim. It also has rich community support.

I use Neovim when editing a particular file from the terminal.

Programming experience

I have been coding for around 3 years and have gained expertise as a Full Stack Developer. I like to follow project-based learning while learning new technology.

I also have a strong foundation in mathematics, particularly calculus and linear algebra, alongside the technical skills listed below.

Programming Languages: JavaScript, TypeScript, Python, C/C++.

Databases: MongoDB, PostgreSQL.

Libraries and Frameworks: React, Next.js, Recoil, Bootstrap, TailwindCSS, Express.js, Hono, Mongoose, Prisma, Streamlit.

Tools and Platforms: Git, GitHub, Docker, Postman, AWS, Cloudflare, Vercel, Turborepo, MATLAB.

Projects:

1. Blogger :

2. PDFusion :

JavaScript experience

I began learning JavaScript while studying web development during my second year of college, and I have not looked back since.

The most exciting aspect of JavaScript is its robust community and extensive library ecosystem. My favorite feature is its versatility: JavaScript can be used for both front-end and back-end development, as well as for command-line tools, VS Code extensions, and more. Furthermore, its support for asynchronous programming via Promises and async/await makes it efficient for tasks that wait on I/O operations or fetch data from servers.

While JavaScript is a powerful language, it does have some quirks that can be frustrating at times. My least favorite feature is its type coercion, where values are automatically converted to another type during operations. This can lead to unexpected behavior and bugs if not handled carefully. TypeScript mitigates the problem at development time through static type checking, but coercion can still occur at runtime.
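A few well-known coercion cases illustrate the point. This is only an illustrative sketch: `addNumbers` is a hypothetical helper introduced here to show one way of making conversions explicit at runtime, not a function from any library.

```javascript
// Implicit coercion: the operator decides which conversion is applied.
var concatenated = 1 + '2';      // number + string => string concatenation
var multiplied = '3' * '4';      // `*` converts both strings to numbers
var looseEqual = ( 0 == '' );    // loose equality coerces '' to 0
var strictEqual = ( 0 === '' );  // strict equality never coerces

// Hypothetical helper: convert explicitly and reject non-numeric input.
function addNumbers( a, b ) {
	var x = Number( a );
	var y = Number( b );
	if ( Number.isNaN( x ) || Number.isNaN( y ) ) {
		throw new TypeError( 'invalid argument. Both arguments must be numeric.' );
	}
	return x + y;
}

console.log( concatenated, multiplied, looseEqual, strictEqual );
console.log( addNumbers( 1, '2' ) );
```

Static types catch the first group of cases while the code is being written, whereas a runtime guard such as the one above is still needed for values that arrive from outside the program.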

Node.js experience

I have used Node.js on the backend, mainly to build APIs for my web applications. I have mostly worked with Express.js and Hono for routing and middleware management, along with cors, jsonwebtoken, and file handling. I have also integrated databases such as MongoDB and PostgreSQL, using Mongoose and Prisma as the data access layers.

C/Fortran experience

My first programming language was C, which is also a subject in my college course. I also learned data structures in C++. This has given me a strong foundation in programming principles, data structures, and algorithms. I haven't actively coded in Fortran; I have only seen its syntax in the stdlib codebase. Based on the foundation I gained while learning C/C++, I am confident in my ability to learn Fortran as required for the proposed project.

Interest in stdlib

Stdlib provides complex mathematical and statistical functions, along with plotting and graphics functionality for data visualization and exploratory data analysis. It also provides utilities for application and library development, including functions to assert, group, filter, map, pluck, and transform data, both in browsers and on the server.

Not every programmer has the technical background to implement complex mathematical and statistical functions themselves. The stdlib library addresses this gap and has vast potential.

It also brings functionality comparable to NumPy and SciPy to JavaScript, including in web browsers.

Version control

Yes

Contributions to stdlib

Merged Pull requests (10):

refactor: update blas/ext/base/dnansum to follow current project conventions

refactor: update blas/ext/base/sapxsumkbn to follow current project conventions

refactor: update blas/ext/base/sdsapxsum to follow current project conventions

refactor: update blas/ext/base/dsapxsum to follow current project conventions

refactor: update blas/ext/base/dcusum to follow current project conventions

refactor: update blas/ext/base/scusum to follow current project conventions

refactor: update blas/ext/base/dssum to follow current projects conventions

feat: add string/base/replace-after-last

feat: add string/base/replace-before-last

feat: add string/base/replace-after

Open Pull Requests (2):

refactor: update blas/ext/base/dnannsum to follow current project conventions

feat: add array/base/mskfilter-map

Goals

The proposed project aims to extend the capabilities of stdlib by implementing BLAS (Basic Linear Algebra Subprograms) routines in JavaScript. BLAS routines are fundamental for performing vector and matrix operations and are widely utilized across various numerical programming languages and libraries such as NumPy, SciPy, MATLAB, and R.

BLAS routines are categorized into three levels:

List of BLAS routines: https://www.netlib.org/blas/
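In the standard BLAS taxonomy, Level 1 covers vector-vector operations, Level 2 covers matrix-vector operations, and Level 3 covers matrix-matrix operations. The sketch below shows one simplified routine per level to make the distinction concrete. It assumes contiguous, row-major storage and omits the stride, leading-dimension, and transposition parameters of the reference routines, so the function names and signatures are illustrative rather than the stdlib or netlib API.

```javascript
// Level 1 (vector-vector), O(N): y := alpha*x + y
function axpy( N, alpha, x, y ) {
	var i;
	for ( i = 0; i < N; i++ ) {
		y[ i ] += alpha * x[ i ];
	}
	return y;
}

// Level 2 (matrix-vector), O(M*N): y := alpha*A*x + beta*y, with A being M-by-N
function gemv( M, N, alpha, A, x, beta, y ) {
	var sum;
	var i;
	var j;
	for ( i = 0; i < M; i++ ) {
		sum = 0.0;
		for ( j = 0; j < N; j++ ) {
			sum += A[ (i*N)+j ] * x[ j ];
		}
		y[ i ] = ( alpha*sum ) + ( beta*y[ i ] );
	}
	return y;
}

// Level 3 (matrix-matrix), O(M*N*K): C := alpha*A*B + beta*C,
// with A being M-by-K, B being K-by-N, and C being M-by-N
function gemm( M, N, K, alpha, A, B, beta, C ) {
	var sum;
	var i;
	var j;
	var k;
	for ( i = 0; i < M; i++ ) {
		for ( j = 0; j < N; j++ ) {
			sum = 0.0;
			for ( k = 0; k < K; k++ ) {
				sum += A[ (i*K)+k ] * B[ (k*N)+j ];
			}
			C[ (i*N)+j ] = ( alpha*sum ) + ( beta*C[ (i*N)+j ] );
		}
	}
	return C;
}

var A = new Float64Array( [ 1.0, 2.0, 3.0, 4.0 ] ); // 2-by-2, row-major
var B = new Float64Array( [ 5.0, 6.0, 7.0, 8.0 ] ); // 2-by-2, row-major
console.log( axpy( 3, 2.0, new Float64Array( [ 1.0, 2.0, 3.0 ] ), new Float64Array( [ 1.0, 1.0, 1.0 ] ) ) );
console.log( gemv( 2, 2, 1.0, A, new Float64Array( [ 1.0, 1.0 ] ), 0.0, new Float64Array( 2 ) ) );
console.log( gemm( 2, 2, 2, 1.0, A, B, 0.0, new Float64Array( 4 ) ) );
```

The nesting depth of the loops mirrors the level: one loop for Level 1, two for Level 2, and three for Level 3.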

Each package implementation goes through four phases:

Upon completion, users will be able to call BLAS routines from JavaScript. In web browsers, the routines will run as JavaScript implementations. In Node.js, provided native bindings have been compiled, the routines will either be ports of the reference implementations or be backed by hardware-optimized system libraries.
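The browser/Node.js split described above amounts to a fallback pattern: prefer a native implementation when it loads, otherwise use plain JavaScript. The following is a minimal sketch of that idea only. `resolve`, `daxpyJS`, and `loadNativeDaxpy` are hypothetical names, and the loader simply throws to simulate an add-on that has not been compiled; it does not reflect how stdlib actually wires up its packages.

```javascript
// Plain JavaScript implementation of y := alpha*x + y; always available.
function daxpyJS( N, alpha, x, y ) {
	var i;
	for ( i = 0; i < N; i++ ) {
		y[ i ] += alpha * x[ i ];
	}
	return y;
}

// Hypothetical loader standing in for a compiled native add-on.
// Here it always fails, as it would when the add-on has not been built.
function loadNativeDaxpy() {
	throw new Error( 'native add-on not compiled' );
}

// Return the native implementation if it loads; otherwise, the fallback.
function resolve( loadNative, fallback ) {
	var impl;
	try {
		impl = loadNative();
	} catch ( err ) {
		return fallback;
	}
	return ( typeof impl === 'function' ) ? impl : fallback;
}

var daxpy = resolve( loadNativeDaxpy, daxpyJS );
var out = daxpy( 3, 2.0, new Float64Array( [ 1.0, 2.0, 3.0 ] ), new Float64Array( [ 1.0, 1.0, 1.0 ] ) );
console.log( out );
```

Because both implementations share one signature, calling code does not need to know which one was selected.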

Why this project?

Stdlib makes complex mathematical and statistical functions, along with plotting, directly usable in the browser. It also provides functionality similar to NumPy and SciPy in JavaScript.

I used BLAS routines in MATLAB while working on an electrical engineering project, and getting a chance to understand how they work under the hood would be great. The existing base BLAS packages need to be updated to follow current project conventions before additional BLAS functionality can be built out. This work is crucial for the further development of the BLAS packages in stdlib, and contributing to it is what excites me the most.

Qualifications

C/C++ was my first programming language, and I have since studied data structures and algorithms in C++, which further strengthened my knowledge of the language.

I have been working with JavaScript and Node.js since my second year of college. During this time, I built the web application projects mentioned earlier, which helped me develop a strong understanding of both. My skill set includes solid proficiency in JavaScript and Node.js, complemented by practical experience with backend technologies, including networking, APIs, and code optimization.

Furthermore, my involvement in stdlib has deepened my proficiency in C, JavaScript, and Node.js. Considering these contributions alongside my existing experience, I am confident that I have the technical expertise necessary for this project.

Prior art

For this project, some of the work has already been started. [RFC]: Add BLAS bindings and implementations for linear algebra (tracking issue)

Reference material

Commitment

My college's summer vacation runs from May to mid-July, so there will be no classes and I can dedicate around 40 hours a week. My college reopens at the end of July; after that, I can work at least 20 hours a week. My planned availability is as follows:

27 May 2024 - 12 July 2024 (6 weeks): 40 hrs/week

13 July 2024 - 26 August 2024 (6 weeks): 20 hrs/week

Total duration: (6 × 40) + (6 × 20) = 360 hours, which fits within the 350-hour category.

Schedule

Assuming a 12-week schedule: Implementation Status for BLAS Routines

Additionally, I plan to submit Pull Requests following the implementation of each package. This approach will prevent the accumulation of a large amount of code for review at the final stage, thereby easing the burden on the reviewing process.

Post GSoC: I would like to continue contributing to stdlib even after the completion of GSoC, working on more projects.

Notes:

Related issues

Progress Tracker

Checklist

Pranavchiku commented 3 months ago

Hi @AuenKr, thanks for your proposal! I feel you should write more about how you are going to tackle/implement these BLAS functions; we all know that these are the ones left. You may also give information about what the levels in BLAS are. This can be much more detailed; please incorporate it and this will be good.

AuenKr commented 3 months ago

@Pranavchiku, thank you for reviewing and providing feedback. I will add examples of implementing a BLAS package for each level and also provide descriptions of each BLAS level.

AuenKr commented 3 months ago

@Pranavchiku, I've made updates to the proposal based on your suggestions. Could you please review it again?

steff456 commented 3 months ago

@AuenKr, thanks for sharing your draft proposal!

I feel it is really well structured and clear. My only comment is that your schedule doesn't leave room for wrapping up the open PRs during the last two weeks. Please remember that there's a review cycle that may be happening and it can take more than just one week, so it would be best to account for that time in the last weeks of your proposal.

Also, if you want, you could use those last weeks to write a blog post about your experience. This is completely optional, so please don't feel pressured to add it!

kgryte commented 3 months ago

Thanks, @AuenKr, for sharing a draft proposal. A few comments:

  1. Your order goes real single -> real double -> complex single -> complex double. I would suggest otherwise. Namely, I would focus on real-valued double-precision, followed by real-valued single-precision, then double-precision complex, and finally single-precision complex. Real-valued double-precision is going to be the primary use case, so getting all d* interfaces completed should be highest priority.
  2. We also include g* interfaces. Some background. We have Fortran implementations in order to ensure high fidelity to the original reference netlib BLAS library. We include a C port due to spotty Fortran compiler support on Windows and to support WebAssembly. We include a JavaScript port in order to run in web browsers and to serve as a fallback in Node.js in case the native add-on has not compiled. The g* interfaces provide a dtype agnostic implementation, which is necessary to avoid manual data copies when operating on, say, an Int32Array. As the C interfaces are typed (e.g., expecting double-precision), we cannot simply pass down an int32 array. Hence, we provide a JavaScript-only package for this use case.
  3. Adding "main" (i.e., non-base) packages may be a bit of a stretch, especially as your proposed plan already suggests opening over 10 PRs per week. Each of those PRs will require review. And some of those PRs will likely implement functionality which is a prerequisite for other functionality (e.g., certain Level 1 routines will be prerequisites for Level 2 routines, and so on).
  4. It may be worth investigating the implementations of various routines to have a better understanding of the dependency graph. If certain BLAS routines are commonly used by other BLAS routines, it would be good to prioritize those over others.
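The dtype-agnostic g* idea from point 2 can be sketched as follows. This is a minimal illustration under stated assumptions: `gaxpy` here is a hypothetical strided routine written for this example, not the actual stdlib package, and the negative-stride indexing follows the common BLAS convention of starting from the far end of the array.

```javascript
// Hypothetical dtype-agnostic routine: y := alpha*x + y for any indexed collection.
function gaxpy( N, alpha, x, strideX, y, strideY ) {
	var ix;
	var iy;
	var i;

	// For negative strides, start at the end so that N elements are still visited.
	ix = ( strideX < 0 ) ? ( 1-N ) * strideX : 0;
	iy = ( strideY < 0 ) ? ( 1-N ) * strideY : 0;
	for ( i = 0; i < N; i++ ) {
		y[ iy ] += alpha * x[ ix ];
		ix += strideX;
		iy += strideY;
	}
	return y;
}

var x = new Int32Array( [ 1, 2, 3, 4 ] );
var y = new Int32Array( [ 10, 20, 30, 40 ] );

// Operates on the Int32Array data in place; no intermediate arrays are created.
gaxpy( x.length, 2, x, 1, y, 1 );
console.log( y );

// A double-precision interface would instead need copies such as this one,
// which allocates new memory and no longer shares a buffer with `x`.
var xCopy = Float64Array.from( x );
console.log( xCopy.buffer === x.buffer );
```

The generic version trades the speed of a typed native routine for the ability to accept any array type directly.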