Closed aman-095 closed 2 months ago
Thank you for sharing a draft of your proposal, @aman-095. This looks good to me. Feel free to go ahead and submit on the GSoC website.
The only small comment I'll make is that you'll want to consider the order in which BLAS packages are added, prioritizing real-valued over complex and double over single precision.
And further there are the "g*" variants but these should be straightforward to implement once the double-precision package is implemented.
Thanks for the review, @kgryte. I will prioritize this and have also added this to my proposal.
Full name
Aman Bhansali
University status
Yes
University name
Indian Institute of Technology, Jodhpur
University program
Mechanical Engineering
Expected graduation
May, 2024
Short biography
I am a final-year undergraduate student in the Department of Mechanical Engineering at the Indian Institute of Technology, Jodhpur. I have completed the following relevant courses in my academic curriculum: Introduction to Computer Science, Data Structures and Algorithms, Linear Algebra and Calculus, and Scientific Computing. I have a strong background in
Python 3.x
,C
,C++11
, andJavascript
developed during projects and courses. The applications of machine learning pique my curiosity. I have a research paper published at the European Conference on Artificial Intelligence (ECAI-23) and several projects employing computer vision and natural language processing.Timezone
IST (UTC + 5:30)
Contact details
email:amanbhansali65@gmail.com,bhansali.1@iitj.ac.in,github:@aman-095
Platform
Windows
Editor
I utilize Windows Subsystem for Linux as it allows me to run a Linux environment on my Windows machine. I use VSCode for my code editor. I’ve been using this since the first time I started programming. The reason at that time was that it supported multiple languages, user-friendly interface, and had a lot of extensions for each programming language. Debugging the code is easier on VSCode. Now that I have started contributing to open source, the reason is its EditorConfig extension and the support of Github Copilot, which make it easy and fast to use for getting jobs done.
Programming experience
Since my first year of undergraduate studies, programming has been a pivotal part of my academic journey. I started with Python through my course on Introduction to Computer Science where I gained experience on how a language works, its syntax, and its structure after which I shifted to C++ delving deeper into its object-oriented paradigm. Further, my Data Structures and Algorithms course required C language and hence I learned C during that time and practiced solving problems. One of the invaluable lessons I’ve gained from my programming journey is the universality of core concepts across languages. While syntax may vary, the underlying principles of logic and problem-solving remain constant. This realization has not only enhanced my adaptability but also reinforced the importance of understanding the fundamentals of any language. As I have my multiple projects using python but when I shifted to other languages I felt that syntax varies rest the understanding, and logic remains the same. So, yes I have around 4 years of experience with programming and I feel confident to work on this project as well.lesson
JavaScript experience
Although I do not have any major projects that are solely inclined toward the utilization of
JavaScript
, I’ve gained an understanding of the language through an online course. Also, my knowledge and experience of programming languages, such asC/C++
andPython
, has provided me with a solid foundation that helped me seamlessly translate toJavaScript
. My contributions to stdlib have been pivotal in further enhancing my knowledge and practical application of the language. I have gained an understanding of how code flows which I feel is the most important. Contributing to stdlib, I was able to grasp the structure fast, and hence my contributions also signify that. This experience portrays my ability to adapt faster, even when faced with unfamiliar concepts or challenges. I am confident that any gaps in my knowledge ofJavaScript
I will be able to bridge, and this will not be a blocker for the project.Node.js experience
The main source of my
Node.js
experience is the stdlib contributions that I have made. I think that my prior knowledge offers a solid basis for fundamentals and hence picking up the language fast. My understanding ofJavaScript
development inside theNode.js
ecosystem has been enhanced by my contributions to several packages and my implementation of three BLAS Level 1 routines.C/Fortran experience
I do not have any prior experience working with Fortran but I have a good experience with C language as mentioned above. Recently while working on
stdlib/blas/base/drot
,stdlib/blas/base/srot
, andstdlib/blas/base/zcopy
BLAS routines, I had to write C implementation from reference Fortran77 code which seemed easy to me. Looking at the reference implementation of other BLAS routines, I do not think less experience with Fortran may serve as any blocker to this project.Interest in stdlib
Numerical computations and fundamental linear algebra have always excited me and working with stdlib has provided me an opportunity to pursue them and work for the open source community. This project offers me a chance to explore more of these operations, and their applications in various scientific and engineering domains. Working here will allow me to contribute to a widely used and respected library within the JavaScript ecosystem. After contributing to stdlib, I attest to the punctiliousness that goes into reviewing each pull request (PR), ensuring that the entire codebase appears standardized and that the use of multiple files never gives the impression that different contributors implemented them. Likewise, I strive for perfection in my endeavors, therefore, this also influences and motivates me. Being part of the stdlib community, I believe, means being part of a vibrant ecosystem of developers and contributors. The collaborative nature of open-source development promotes the true spirit of learning, innovating, and sharing, which I find to be highly rewarding and motivating.
Version control
Yes
Contributions to stdlib
Following is the list of Pull Requests that I have made in the past to the GitHub repository of stdlib, PS: I have listed contributions from most recent to least recent, list order does not determine the difficulty of the PR.
Merged:
Open:
A comprehensive list of all my merge requests can be found at:
Goals
In this section I would be discussing the project in detail. Specifically, I will be listing down the abstract, outcomes of the project, project description, current status of work, and approach.
1. Abstract:
This proposal is directed toward adding BLAS (Basic Linear Algebra Subprograms) routines and their C / Fortran / Js implementation for linear algebra. In linear algebra, BLAS routines are predominantly employed to perform operations on matrices and vectors. BLAS are commonly utilized in the development of advanced linear algebra software owing to their efficiency, portability, and extensive accessibility. The major goals associated with the project are:
2. Project Outcome
Once we have the BLAS packages they can be used for vector and matrix operations and can be called using APIs i.e. a comprehensive suite of BLAS routine APIs will be incorporated into stdlib. Users will be able to call BLAS routines from
JavaScript
. In web browsers, BLAS routines will be inJavaScript
. Additionally, the integration ofC
andFortran
implementations will aim at optimizing performance and enabling potential hardware-based implementations.3. Project Description
There are three levels of BLAS routines:
Within the hierarchy of BLAS routines, there exist distinct iterations of the same routine distinguished by their data types, denoted through specific conventions. For instance, variations such as SROT, DROT, CSROT, and ZDROT adhere to this convention. Here, the prefixes ’S’, ’D’, ’C’, and ’Z’ respectively signify real single precision, real double precision, complex single precision, and complex double precision. In stdlib we also have 'G' type which is the generic type implementation for each routine. Current Status: I am tracking the status of the implementations via an issue:
4. Approach:
First, I will start with the implementation of Level 1 routines and then move towards Levels 2 and 3. In each level, the priority will be first real, followed by complex datatypes, and further inside real for real double precision, followed by real single precision, and similarly for complex double and single precision. Once these are done I will also implement the 'G' type for each routine. I propose to split the implementation of the complete package into two Pull Requests (PRs) for efficiency and clarity. The first PR would contain the Js implementation of the Fortran code from the reference mentioned above and other required files. Then the second PR will contain the C / Fortran implementation. After a few implementations of routines, I will then use one Pull Request for the entire routine.
The real challenge while implementing a package from scratch lies in creating a comprehensive test suite and standard benchmarks. Javascript being a weakly typed programming language makes it easy to write up the core implementation, and then the implementation can be tested against registered tests and benchmarks.
On the other hand, C and Fortran code demands strict variable types and is prone to error, so once we have a thorough test suite merged via the first PR, we can test the new implementation and be assured that we are on the right track.
Below, I will outline the detailed implementation plan for the ISAMAX routine within the Level 1 BLAS library:
Routine Description:
Based on the notations above discussed, ISAMAX is a single-precision routine. It finds the index of the element having the first maximum absolute value inside the real single precision array. Parameters associated:
Routine Signature
The above code snippet is the Fortran95 definition of the function that I will implement using the Fortran77 code. It has three parameters as discussed above.
The above code snippet is the function definition of the native Js implementation for ISAMAX.
In JavaScript, arrays are implemented as objects, and there is no direct access to memory pointers. An ndarray in JavaScript often utilizes a typed array as its underlying buffer. Say we have a typed array buffer with 100 elements and we want to apply any filter over this data instead of creating a new buffer for applying offset we have offset passed as a variable rather than using additional memory.
The above code snippet is the function definition for the ndarray case. It has four parameters, the new parameter here is the offsetX. By using an offset, we can pass a pointer to the memory location indicated by the offset, effectively passing only the relevant portion of the buffer to the C function without needing to re-instantiate a new object. This function will help us give users the freedom to set offset, which in the normal function is zero by default, i.e., it provides alternative indexing semantics.
The above code snipet is corresponding to the C signature of the Fortran77 implementation. The definition has a int return type for the index, the parameter X indicating that isamax is a single precision.
The above code snippet is the signature for cblas which refers to a C translation of the FORTRAN77 BLAS accessible through the LAPACK linear algebra library. It is only used when we are targeting hardware-optimized libraries such as OpenBlas & Apple accelerators.
Routine Implementation
Now we have two parts of the implementation as discussed for a few routines, first the js implementation of the Fortran77 code as first PR and then the C and Fortran implementation in the second PR. Now from this code, it is crucial to determine whether or not all the intrinsic functions utilized in the Fortran implementation are present in our library prior to working on the routine implementation. If not I will implement that first and then proceed with the work. Here we see that the Fortran code uses abs for which we have javascript and C implementations available thus this won’t be an issue.
Js Implementation
The primary files needed for the JavaScript implementation of the two APIs as previously discussed include:
lib: This folder will contain three files, first named isamax.js, containing the implementation for normal isamax with a predefined offset of 0, and will contain the code using the fortran code and the entire structure following standards based on the other stdlib packages. The other file named ndarray.js will have a function to provide alternative indexing semantics to the user, i.e., an extra parameter in the function for the offsetX indicating from where to start. The ndarray is used in cases where we have an array as an input parameter and we want to allow users additional flexibility in their offset. We have index.js which calls native.js which is added after C implementation is done. Then we will have the main.js file, which, based on which function is called, moves either to the isamax.js file or the ndarray.js file, and further operations will follow the code. This way, we have the implementation in Js ready for the isamax function.
The above code is for the native js implementation
The above code is the ndarray.js implementation.
This can be made for isamax; we can also do the same and get the tests. The other tests could be made based on different cases, i.e., negative strides, complex access patterns, etc., based on requirements. Similarly, tests are made for the Ndarray case. The third file is the test.js file, which checks if all the functions being called and used exist in the package. Similar to other files, the structure will be the same as that used in other stdlib files.
C / Fortran Implementation:
Firstly, I will outline the steps involved in crafting the C implementation, followed by the Fortran counterpart. Given our prior discussions, the focal point lies in the implementation code, and introducing tests at this juncture won’t pose any complications since we already have them prepared for the JS implementation. Furthermore, it is imperative to update our README file, documenting the C APIs and providing examples akin to what we’ve done for JS. This entails adhering to the standard structure and format prevalent in stdlib packages, a familiarity I’ve acquired through my past contributions. Here is the list of required files & folders:
Within the directory structure, there exists a folder labeled include/stdlib/blas/base/isamax housing header files such as isamax.h, isamax fortran.h, and isamax cblas.h. When necessitating the utilization of these functions, we incorporate the pertinent header file into our implementation, as will be exemplified in the src directory.
In the src directory, one can find both the C and Fortran implementations for the specified routine. It’s crucial to grasp that C / Fortran implementations are favored for their performance in mathematical operations. The purpose of adding both Fortran and C implementations is based on what the compiler capabilities are. Fortran will be used as long as host can compile Fortran. A lot of scientific computations performing libraries and legacy codebases such as lapack are written in Fortran and by providing a Fortran implementation, it can seamlessly integrate with existing Fortran-based software. Hence, our objective is to employ JavaScript as a wrapper to transmit parameters and utilize C for computation, similarly leveraging C as a wrapper to engage Fortran for computation.
We will proceed by creating the file named isamax.c, which will contain the C implementation for the BLAS routine isamax. This file will include the header file isamax.h, as previously discussed, along with any intrinsic functions necessary, such as here it is abs.
Another file to be created is isaamax cblas.c, where cblas refers to a C translation of the FORTRAN77 BLAS accessible through the LAPACK linear algebra library. This interface is utilized by devices with support for cblas, such as Apple accelerators, to transmit parameters to the cblas function.
Next, we have the Fortran implementation, comprising isamax.f and isamaxsub.f. In Fortran, parameters are typically passed as pointers, and the presence of a sub-function indicates that it is a subroutine, which doesn’t return any value.
isamax.f contains the free-form implementation of the Fortran code. On the other hand, the sub-file, isamaxsub.f, serves as a wrapper aiding in accessing results from C. This implies that isamaxsub.f will include an additional parameter for the maximum index. In Fortran, where everything is passed by reference, this sub-file calls isamax.f and stores the address of the index result.
Furthermore, there’s another file named isamax f.c. This function, in its implementation, introduces a variable for the answer and passes its reference or address to a sub-file. The sub-file employs Fortran code to obtain the result and then assigns it to the address provided by C. This approach is adopted to establish an interface enabling C to pass parameters while leveraging Fortran implementation for calculations, thereby enhancing performance.
This way, the entire BLAS for ISAMAX will be implemented, as will the other BLAS routines. An important point to note is that all the files should follow the standard convention of files used in stdlib, which I have gained a decent amount of understanding for while implementing the drot, sort, and zcopy BLAS routines.
Why this project?
The proposed project, presents an exciting opportunity to explore and understand the concepts of numerical computation, particularly focusing on fundamental linear algebra operations. The major factors behind my motivation for the project are:
Discussions with Athan and Pranav have further bolstered my confidence and motivation to embark on this project.
Qualifications
The requirements for this project are proficient knowledge of C, JS, and Fortran, knowledge of scientific computing, and understanding of the structure, format, and flow of stdlib files and code. Based on the aforementioned sections pertaining to my experience with C/Fortran and Js, I am confident that I possess the requisite skill set and knowledge of the technology framework necessary for this undertaking. I also took a course on Scientific Computations as I mentioned above which built my interest and knowledge in the domain. As a result of gaining a sufficient comprehension of how computations for these functions operate and the need for high-performance implementations through the completion of this course, I am qualified to begin working on this. Also, I am a quick learner and motivated to learn new things, furthermore, my contributions to stdlib have provided me with insight into the package’s architecture, and through a conversation with Athan, I realized why it is critical to achieve perfection in implementation and why the package’s structure as a whole is significant, not just the implementation. Thus, based on my merged PRs I feel that I am learning to do things the stdlib way.
Prior art
BLAS in a library is essential for efficient numerical computing. A library that makes use of BLAS can make use of highly optimized versions of these operations, which are frequently created and tuned for particular hardware architectures. Also, as I have worked in the domain of ML, the models behind them have a huge usage of linear algebra, and having this can even open doors further.
The traditional fortran77 implementation for BLAS routines is available at lapack-netlib further libraries such as OpenMathLib contain the modern implementation based on the lapack-netlib that helps in efficient and optimized computations. Numpy and Scipy, for example, are libraries that use BLAS to execute operations on matrices and vectors efficiently. For example, dot(), linalg.svd(), etc. are based on BLAS. In stdlib, we have implementations for BLAS Level 1 routines now, with a few of them still remaining. I am tracking the status of all the already-implemented BLAS routines and they will serve as a reference for the structure of the routines that I’ll implement further using the Fortran77 code.
Commitment
In the project goals, as I mentioned, there are a few BLAS routine functions for which the C and Fortran implementations are remaining. In the period before GSOC, I wish to complete these packages entirely. Also, I will try to implement one BLAS Level 2 routine during the GSOC period to gain an overall understanding of how vector and matrix computations work, which will help me during the GSOC period, and I will be able to reduce significant time in the implementation of BLAS Level 2 routines as well. I estimate that I will complete around 6–7 routines of Level 1 before GSOC. I am estimating that I will be easily able to devote around 15 hours per week before the GSOC period working in the direction as written above.
During the GSOC period, I wish to contribute approximately 30 hours per week to work on the project for around 350 hours. As I am a final-year student and I am placed, summer will entirely be free for me as of now. I don’t have any major plans, and hence I will aim to complete as many packages as I can during this period. In the Schedule section of my proposal, I have provided a detailed structure. After the GSOC period, I will not be able to devote the same amount of time as during the period, and since I find this exciting, I will work slowly and try to complete the remaining Level 3 or any remaining routine from another level if remaining after the GSOC period.remain
Schedule
Assuming a 12 week schedule,
Community Bonding Period: In this stage, I hope to engage with my mentors and fellow contributors, deepening my understanding of the project as we go. In order to lessen the blocker throughout the course of the 12 weeks, I’ll try to identify any intrinsic functions that are utilized in any BLAS routine but haven’t been implemented yet and work to get those done over this time. During this time, I also want to begin working on easy Level 1 routines like ISAMAX and the double, complex single, and complex double routines that go along with it. Aside from this, I will be happy to work on whatever other front my mentors ask me to, and I will do so.
Week 1 - Week 3: The sequence will mirror the one I specified in the objectives: commence with Level 1 BLAS routines, for which I anticipate spending approximately one day per Level 1 BLAS routine initially; this time will subsequently decrease as I acquire experience and accelerate. According to my estimation, I will have completed approximately 6-7 Level 1 routines before GSOC. Consequently, I anticipate effortlessly completing the remaining routines of Level 1 within three weeks. After each week concludes, I intend to publish blog posts on my website detailing the work completed, progress made, and a strategy for the following week.
Week 4 - Week 9: BLAS Level 2 routines will be the target for these weeks, it consists of approximately sixteen single-precision routines, followed by the same number of complex single, complex double, and double routines. Prior to GSOC, I anticipate finishing one Level 2 routine, which will serve as a point of reference for subsequent operations. I anticipate finishing all Level 2 products within six weeks. Additionally, if necessary, the midterm report will be due in week 6, and I intend to complete it in accordance with my progress.
Week 10: This week I will start working on Level 3 routines. Furthermore, while Level 2 routines may prove to be somewhat intricate and time-consuming, I anticipate having them completed by the conclusion of Week 10 if left will be completed this week.
Week 11 - Week 12: I will continue with the implementation of BLAS Level 3 and see how far I can go. As this is the final week, I will complete any outstanding pull requests associated with any routines implemented so far.
Final Week: This week I will try to implement the remaining Level 3 routines and additionally, if necessary work on my final report. I anticipate completing the entire Level 3 routines by the end of the final week.
Post-GSOC: After these 12 weeks, we may be left with a few BLAS Level 3 routines. I will continue to implement these routines gradually, taking into consideration the time available to me. It would be inappropriate to provide a comment at this time regarding the weekly hours I will be able to dedicate. Nonetheless, I am confident that I will be able to devote time to stdlib, contribute to its development, and complete the remaining tasks.
Notes:
Related issues
Tracking Issue:
Project Idea Issue:
Checklist
[RFC]:
and succinctly describes your proposal.