Closed Daniel777y closed 6 months ago
Hello, @kgryte @Planeshifter @Pranavchiku.
This draft proposal comes a bit late but I'm eager for the chance to contribute to this community and learn from its exceptional members. Any feedback or suggestions you might have would be greatly appreciated. Thank you.
Thanks @Daniel777y for your proposal and desire to contribute to stdlib!
This is definitely an area where it would be good to make progress. A few comments and suggestions for strengthening the proposal:
fs/promise, although we currently lack Promise-APIs
fs APIs and studied packages such as fs-extra; any functions from there or other missing ones that should be included in stdlib's fs namespace?
Thanks for working on this proposal. One follow-up question I have is
fs.cp was added in v16.7.0. One of the core principles of stdlib is ensuring backward compatibility. Historically, we've supported all the way back to Node.js v0.10. While we could potentially relax this requirement, how do you plan to support stdlib APIs which can accommodate and smooth over the differences across Node.js versions?
Ideally, any fs API we provide should work across all versions of Node.js that we support. And thus, for Node.js versions with missing functionality, we'd need to provide polyfills. And this could potentially be quite involved, and, if so, could affect your project timeline.
@kgryte Yes, thanks for the response. As you mentioned, I do need to consider backward compatibility.
I walked through the implementation of readable-stream and tried to understand how it works. Take the isReadable function, for example. What it does is roughly:
var isReadable = require('stream').isReadable || require('readable-stream').isReadable;
That is, if the native stream module has the isReadable function, or readable-stream is disabled for some reason, we use the one from stream; otherwise, we use the one from readable-stream. So I might have to implement some functionalities manually, such as cp, instead of taking them from the native fs.
In the isReadable case, its implementation is:
function isReadable(stream) {
    if (stream && stream[kIsReadable] != null) return stream[kIsReadable]
    if (typeof (stream === null || stream === undefined ? undefined : stream.readable) !== 'boolean') return null
    if (isDestroyed(stream)) return false
    return isReadableNodeStream(stream) && stream.readable && !isReadableFinished(stream)
}
isDestroyed, isReadableNodeStream, and isReadableFinished are manually implemented utility functions as well.
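For fs, the same "native first, fallback second" pattern might look like the sketch below. It is only an illustration under stated assumptions: copyFileSyncPolyfill is a hypothetical helper (not an existing stdlib or Node.js function), and it ignores the optional mode argument that the native fs.copyFileSync (added in Node.js v8.5.0) accepts.

```javascript
var fs = require( 'fs' );

// Hypothetical fallback for Node.js versions lacking `fs.copyFileSync`.
// Assumption: reading the whole file into memory is acceptable (not true for very large files).
function copyFileSyncPolyfill( src, dest ) {
    fs.writeFileSync( dest, fs.readFileSync( src ) );
}

// Feature detection: prefer the native implementation when it exists.
var copyFileSync = ( typeof fs.copyFileSync === 'function' ) ?
    fs.copyFileSync :
    copyFileSyncPolyfill;
```

Callers would then use copyFileSync( src, dest ) without needing to know which implementation was selected.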
Do you think this idea is enough to resolve the backward compatibility issue in stdlib?
As for the complexity, I think I can extend my timeline to 16 weeks to ensure I have enough time to learn and implement the features.
I had guessed that graceful-fs also provides polyfills across different versions, but unfortunately it seems that it does not.
Yes, potentially. readable-stream is a similar idea, but arguably more complex than is necessary. I would anticipate needing to manually implement in a number of cases. For those APIs having many options, could be a bit of a slog to polyfill and ensure adequate testing.
@Planeshifter Thanks for your feedback!
Yes, I also walked through graceful-fs and fs-extra to see how they implement fs functionalities, while some other modules, like fs-minipass and chokidar, also provide references for specific features in fs. I will discuss priorities in detail with mentors and decide what extra features to implement and the order of implementation.
As for Promise-APIs, it would indeed be good to provide a modern Promise style in stdlib, though I noticed that stdlib currently provides sync/async APIs. If Promise-APIs are needed, functions that natively support Promises can be implemented like the sync/async APIs. For those that do not, I can implement the async version first and then "universalify" it to Promise style, as fs-extra and universalify do.
As an example, for the rename function, I can do something like:
var rename = require( '@stdlib/fs/rename' );

// Wrap a callback-style function so that it returns a Promise when no callback is provided:
function universalify( fn ) {
    return Object.defineProperty( function wrapper( ...args ) {
        if ( typeof args[ args.length-1 ] === 'function' ) {
            // A callback was provided; defer to the callback-style function:
            return fn.apply( this, args );
        }
        return new Promise( ( resolve, reject ) => {
            args.push( ( err, res ) => ( err != null ) ? reject( err ) : resolve( res ) );
            fn.apply( this, args );
        });
    }, 'name', { value: fn.name } );
}

var universalRename = universalify( rename );
That is, when users rename a file asynchronously, if they pass a callback function, it will use the callback; otherwise, it will return a Promise:
universalRename( './beep/boop.txt', './beep/foo.txt', done );
// or
universalRename( './beep/boop.txt', './beep/foo.txt' ).then( done );
This is the general idea. Do you think it is good enough for Promise-APIs? I can universalify existing APIs in stdlib to Promise style in the coming days to give it a try. If the Promise-APIs workload turns out to be time-consuming, maybe we can divide it into sub-projects, and I would love to continue working on them after GSoC.
The potential risks or obstacles concern the scope of work and the timeline. As mentioned previously, I will need to implement some utility functions to ensure backward compatibility, handle functionalities with options, and try to provide Promise-APIs, so I need to balance priorities against difficulty of implementation. On the bright side, there are many references and examples in other packages, and I am also free to extend the timeline to 16 weeks to ensure I have enough time to learn and implement the features.
As for correctness and performance, every implementation will be tested with the Tape framework, with benchmarks reported in TAP format. All functionalities will provide concise error messages and polyfills for older Node.js versions. Another concern is that one emphasis of stdlib is scientific computing, so I suppose it would process large or multiple files. Do you think it is necessary to handle some edge cases to avoid potential crashes?
I'd advocate for providing dedicated promise APIs and not the pattern of "if no callback, return a promise". That would fundamentally change error handling for legitimate use cases where a user intentionally does not provide a callback.
In general, I'd focus first on callback APIs. Then move to promise APIs (e.g. @stdlib/fs/promise/*). For the promise APIs, the main prerequisite is that we need to create @stdlib/promise/ctor with a polyfill fallback for older environments not having native Promise support.
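A dedicated promise API along these lines could be sketched as follows. This is a minimal illustration, not the planned implementation: renamePromise is a hypothetical name, and the sketch assumes a native global Promise, whereas the real package would obtain the constructor from the proposed @stdlib/promise/ctor.

```javascript
var fs = require( 'fs' );

// Hypothetical dedicated promise API: it always returns a promise and never accepts a callback.
// Assumption: a native global `Promise` exists in the running environment.
function renamePromise( oldPath, newPath ) {
    return new Promise( function executor( resolve, reject ) {
        fs.rename( oldPath, newPath, function onRename( err ) {
            if ( err ) {
                return reject( err );
            }
            resolve();
        });
    });
}
```

Because the callback and promise variants live in separate functions, omitting a callback from the callback API keeps its existing behavior instead of silently switching to promise semantics.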
@kgryte Thank you very much for your suggestions! This way the stdlib code will be easier to reuse and maintain. I suppose @stdlib/promise/* could take up some of the schedule, but I'd love to give it a try. I found promise-polyfill for reference.
Full name
Dexu (Daniel) Yu
University status
Yes
University name
Northeastern University
University program
Computer Science
Expected graduation
2025
Short biography
I am currently pursuing a Master's degree in Computer Science at the Oakland campus of Northeastern University, having previously earned a Bachelor's degree in Software Engineering. My technical skill set encompasses programming languages such as C/C++, JavaScript, and Python. Additionally, I possess a strong foundation in Docker, Linux, MySQL, and Firebase.
During my undergraduate and graduate studies, I took various courses in the field of computer science, such as Data Structures, Operating Systems, Programming Design Paradigm, Web Development, and Software Testing.
My passion for problem-solving has drawn me to competitive programming, where I have honed my abilities in algorithms and optimization. Moreover, I feel a great sense of achievement in developing personal applications and working in group projects, which allow me to bring my innovative ideas to life.
Timezone
US Pacific Time
Contact details
GitHub: Daniel777y, e-mail: yu.dex@northeastern.edu, DanielYu3790@gmail.com
Platform
Mac
Editor
Vim and tmux are my favorite choices for coding. The best thing about Vim is that I can start coding on any computer with just a few settings. Once I got the hang of its shortcuts and how it works, I found I could code much more efficiently. When I'm working on bigger projects, I also use VSCode because it has a lot of plugins and helps me manage files better, making everything smoother.
Programming experience
My programming experience includes competitive programming as well as personal and group projects, covering a wide range of technologies, such as React.js, Vite.js, Vue.js, Node.js, Django, MySQL, C/C++, Bootstrap, and Tailwind CSS. Here are some of my recent projects:
FlashFingers: A web typing game | React.js, Vite, Node.js, Firebase
DevDeck: A tech stack selector of a project management tool | React.js, Vite, Node.js, Firebase
BeiBei Words: A program for memorizing English words | Python, Uni-app, Vue.js
WuMo Drawing: A mobile game for children to write and draw | JavaScript, Cocos Engine
JavaScript experience
I use JavaScript intensively to develop various full-stack applications, for course work as well as personal projects. Recently, I also started contributing to stdlib by implementing the math/base/tools/normhermitepolyf package for evaluating a normalized Hermite polynomial using single-precision floating-point arithmetic.
I really appreciate JavaScript's ease of learning, flexibility, and widespread adoption. It allows me to organize code in a traditional OOP style or in a functional style. Moreover, it streamlines and simplifies the development process significantly. For instance, I can craft user interfaces with React.js or Vue.js, and develop server-side applications with Express.js.
However, one limitation of JavaScript is its performance in computation-intensive tasks. While it's a popular choice for web development, for tasks requiring heavy computation, such as data analysis or machine learning, developers might prefer Python or R. Imagine if we could do such tasks in a browser; that would be exciting!
Node.js experience
In my full-stack projects, I usually utilize Node.js and Express.js for backend development. This includes tasks like database connectivity, API implementation, and file management among others.
C/Fortran experience
I learned C in my freshman year of undergraduate, and I have applied it in multiple course projects, including developing a library management tool and file system in Linux. Beyond these applications, I have been using C/C++ in competitive programming contests for over five years, which has helped me build a strong foundation in this language.
Interest in stdlib
When I first delved into competitive programming, the majority of participants favored C/C++ and Java. However, in recent years, more and more people have started using Python, particularly with libraries like NumPy for computational tasks, while the use of Java has significantly dwindled and even C/C++'s dominance has seen a decline.
As I've mentioned, JavaScript's popularity in web development is undeniable, offering ease of implementation for ideas and product demonstrations. However, for data retrieval and analysis, developers still turn to alternative languages. Can JavaScript go further?
Stdlib attracts me because it enhances JavaScript's flexibility and capability in areas such as numerical and scientific computation. This expansion not only broadens JavaScript's applicability but also shows the potential for intensive computing tasks to be executed directly in the browser. Especially with the rising importance of machine learning and data science, I believe there will be more and more innovative applications built with JavaScript, necessitating robust libraries like stdlib to support these advancements.
Version control
Yes
Contributions to stdlib
Pull Request
Issue
My first contribution was implementing the single-precision equivalent of math/base/tools/normhermitepoly.
Though not aligned with the project I am proposing, this experience has given me a good understanding of the community's standards and the development process.
Also, the task of reimplementing single-precision functions shares similarities with the work involved in implementing the fs module, as both tasks are guided by a related overarching approach.
Goals
The primary goal of this project is to achieve complete feature parity with the Node.js fs module, thereby providing users with a full set of file system operations within stdlib.
Additionally, this project will enhance compatibility with older versions of Node.js. Developers, even if they use older Node.js versions, will therefore be able to access and use new file management features through stdlib.
Moreover, I will implement some Promise-APIs for these functionalities, which will be beneficial for developers who prefer using Promises over callbacks.
The successful implementation of this project is expected to significantly enhance stdlib's flexibility and utility.
Functionality
Here are some of the functionalities I am planning to implement (asynchronous versions):
mkdir
mkdtemp
rmdir
opendir
cp
copyFile
rm
move
ensureFile
ensureDir
emptyDir
access
stat
utimes
chmod
chown
link: creates a link between two paths.
readlink
read
write
appendFile
truncate
createReadStream
createWriteStream
watch
watchFile
unwatchFile
constants: returns an object which contains commonly used constants for file system operations.
Besides asynchronous functions, I will also implement synchronous versions of them.
Other than these functionalities, I will also implement utility functions to polyfill the older versions of Node.js.
Compatibility
Currently, stdlib is compatible with Node.js v0.10 and above. Therefore, to maintain compatibility with older versions of Node.js, I need to provide polyfills for some functionalities. To do this, I can borrow ideas from readable-stream. Here's a general example:
In this case, I might need to manually implement some helper functions and handle the various options.
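As one illustration of option handling, the recursive option of fs.mkdir is only available natively in Node.js v10.12.0 and later, so older versions would need a manual fallback. A minimal synchronous sketch follows, where mkdirRecursiveSync is a hypothetical helper name and edge cases such as permission errors on root directories and concurrent creation by other processes are not handled:

```javascript
var fs = require( 'fs' );
var path = require( 'path' );

// Hypothetical fallback emulating `fs.mkdirSync( dir, { recursive: true } )` on older Node.js versions.
function mkdirRecursiveSync( dir ) {
    var parent;
    dir = path.resolve( dir );
    try {
        fs.mkdirSync( dir );
    } catch ( err ) {
        if ( err.code === 'ENOENT' ) {
            // A parent directory is missing; create it first, then retry:
            parent = path.dirname( dir );
            if ( parent === dir ) {
                throw err;
            }
            mkdirRecursiveSync( parent );
            fs.mkdirSync( dir );
        } else if ( err.code === 'EEXIST' && fs.statSync( dir ).isDirectory() ) {
            // The directory already exists, which is not an error in recursive mode:
            return;
        } else {
            throw err;
        }
    }
}
```

A polyfilled mkdir could dispatch to a helper like this only when the recursive option is set and the running Node.js version lacks native support.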
Promise
stdlib also plans to provide Promise-APIs in the long run. Therefore, I will also try to implement Promise versions of these functionalities. But I will first focus on the callback and synchronous versions, then move to @stdlib/fs/promise/* later. I need to polyfill the older versions not supporting native Promise as well. To do this, I can borrow ideas from promise-polyfill and implement @stdlib/promise.
Performance
To ensure correctness and performance, every implementation will be tested with the Tape framework, with benchmarks reported in TAP format.
All functionalities will provide concise error messages and handle potential edge cases, such as invalid paths and permission-denied errors.
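To make the error-handling idea concrete, here is a rough sketch of how invalid arguments could be separated from runtime failures. Both validatePath and statChecked are hypothetical names introduced only for illustration, and the message wording is an assumption rather than an agreed stdlib format:

```javascript
var fs = require( 'fs' );

// Hypothetical helper: returns `null` for a valid path argument and a TypeError otherwise.
function validatePath( value ) {
    if ( typeof value === 'string' || Buffer.isBuffer( value ) ) {
        return null;
    }
    return new TypeError( 'invalid argument. First argument must be either a string or a Buffer. Value: `' + String( value ) + '`.' );
}

// Hypothetical wrapper: invalid arguments throw synchronously, while runtime
// errors (e.g., ENOENT or EACCES) are passed to the callback.
function statChecked( path, clbk ) {
    var err = validatePath( path );
    if ( err ) {
        throw err;
    }
    if ( typeof clbk !== 'function' ) {
        throw new TypeError( 'invalid argument. Last argument must be a function. Value: `' + String( clbk ) + '`.' );
    }
    fs.stat( path, function onStat( error, stats ) {
        if ( error ) {
            return clbk( error );
        }
        clbk( null, stats );
    });
}
```

With this split, a missing file or a permission problem reaches the caller as an error object in the callback, while a wrong argument type fails immediately with a short, specific message.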
Documentation and Examples
I will adhere to the development guidelines, offering comprehensive examples and documentation for each function to help users understand their usage and support developers in code maintenance.
Why this project?
File management is a core operation for developers, and stdlib focuses on numerical and scientific computing, making the file system crucial for handling data files. By contributing to this project, I will be enhancing the capabilities for high-performance applications that run in browsers with stdlib, which I find immensely exciting.
Additionally, this experience will deepen my understanding of JavaScript and Node.js. While I have previously worked with file systems within various frameworks, engaging with this project will give me a much deeper understanding of the mechanics of file systems and other functionalities.
As a newcomer to open source, I see this project as a valuable opportunity to contribute meaningfully to the real world and to learn best practices in software development. Working on math/base/tools/normhermitepolyf was an eye-opening experience for me. I was particularly struck by the detailed coding standards, the structured development cycles, and the thorough testing norms. It's exciting to anticipate a long-term involvement with this community.
Finally, being part of stdlib's expanding community of both contributors and users is motivational and an honor. I am eager to make a significant impact on the development of excellent applications and to collaborate with dedicated mentors and fellow contributors.
Qualifications
Programming Skill: With over five years of experience in competitive programming primarily in C/C++, coupled with various projects in JavaScript, Python, Node.js, I have strong knowledge of these technologies and understand the requirements and scope of this project. My experiences have equipped me with a robust grasp of algorithms and data structures, as well as the capability to implement complex programs efficiently.
Problem-Solving Skills and Self-Driven Attitude: I enjoy challenging difficulties and remain motivated even when faced with obstacles. I will leverage online resources, learn to use advanced tools, and consult experts when needed to find solutions.
Collaboration and Knowledge Sharing: I used to write blogs and offer free lessons on algorithms, sharing my knowledge with others. Additionally, my active participation in team projects reflects my cooperative nature and openness to collaboration.
Prior Contributions to stdlib: My prior contribution to stdlib, specifically the addition of the math/base/tools/normhermitepolyf package, has made me well-acquainted with the community's standards. This experience has paved the way for a smooth transition into working on this project.
Prior art
In this project, I will implement features of the Node.js fs package, so I'll mainly use the Node.js source code as a guide.
Here are some extra packages I might refer to:
graceful-fs and fs-extra: give replacements for the native fs module.
readable-stream: gives an example of polyfilling.
fs-minipass: provides implementations of streams.
chokidar: watches file changes.
promise-polyfill: polyfills promises.
Additionally, I might borrow approaches from file systems in other languages, like Python's PyFilesystem.
Commitment
My semester ends on April 29th, so I’ll be completely free to work on this project from May to August. Since I don't have any other occupations, I can put in more than 30 hours each week during the summer. After the GSoC program ends, I plan to keep contributing in the community for about 10 hours a week.
Schedule
Assuming a 12-week schedule with 4 extra weeks.
Community Bonding Period: In this phase, I will deepen my familiarity with stdlib, including its norms and source code, while starting to work on related issues. In addition, I will consult with mentors and refer to other resources to improve my project plan and determine the scope of work. For example, I will try to find out which functionalities need polyfills and the best approach to implement them.
Week 1 to Week 3: I will mainly work on basic file and directory management functionalities and open PRs for mkdir, mkdtemp, rmdir, ensureFile, ensureDir, emptyDir, cp.
Week 4: This week, I'll deal with any backlog and seek feedback from mentors to evaluate the quality of my work. Additionally, I may revise the work plan for the upcoming phase based on feedback and my progress.
Week 5 to Week 7: I will start developing file manipulation functionalities and open PRs for copyFile, rm, move, read, write, appendFile, truncate. Week 6 marks the midterm evaluation period. By this time, I aim to have completed half of the planned functionalities. This will also be an opportune moment to collect feedback from my mentors.
Week 8 to Week 9: During this period, I will implement and open PRs for metadata and permission functionalities, such as constants, access, stat, utimes, chmod, chown. Also, I will handle any backlog.
Week 10 to Week 11: By this time, my attention will turn to link and stream functionalities, such as link, readlink, createReadStream, createWriteStream. Moreover, I will consult with mentors to ensure the quality of my work and plan for the final phase.
Week 12 to Week 13: I will focus on watching functionalities, such as watch, watchFile, unwatchFile, and start working on @stdlib/promise.
Week 14 to Week 16: In this final phase, I will try to implement Promise versions of fs functions. Lastly, I will test and review all of my work, and handle any final tasks.
Post GSoC: After completing this project, I plan to continue contributing to the stdlib community by working on the Promise-APIs and supporting other packages.
Notes:
Potential risks
The potential risks or obstacles concern the scope of work and the timeline. As mentioned previously, I will need to implement some utility functions to ensure backward compatibility, handle functionalities with options, and try to provide Promise-APIs, so I need to balance priorities against difficulty of implementation. On the bright side, there are many references and examples in other packages, and I am also free to extend the timeline to 16 weeks to ensure I have enough time to learn and implement the features.
Related issues
#10 [Idea]: achieve feature parity with builtin Node.js fs module
Checklist
[RFC]:
and succinctly describes your proposal.