stdlib-js / google-summer-of-code

Google Summer of Code resources.
https://github.com/stdlib-js/stdlib

[RFC]: Make code blocks on website documentation interactive #60

Closed Shubh942 closed 2 months ago

Shubh942 commented 3 months ago

Full name

Shubh Mehta

University status

Yes

University name

IIIT Jabalpur

University program

Computer Science

Expected graduation

2025

Short biography

I am Shubh Mehta, a pre-final-year student pursuing a bachelor's degree in Computer Science at IIIT Jabalpur. My focus lies in cybersecurity research, where I actively look for vulnerabilities in the systems of multinational corporations. I also develop websites and APIs with a strong emphasis on security and correctness.

I am proficient in languages such as C++, JavaScript, and Python, and in tools such as Node.js and React.js, and I am currently expanding my skills in DevOps. My involvement in competitive programming has honed my problem-solving abilities, particularly for complex problems. In addition to practical experience, I have studied the theoretical foundations through coursework covering Data Structures and Algorithms.

I am keenly interested in frontend application maintenance, particularly in mitigating security threats like Cross-Site Scripting (XSS), which can expose websites to exploitation by attackers. My familiarity with security will help me in undertaking the project "Make code blocks on website documentation interactive". Working on this project will deepen my knowledge of both frontend development and security, with the aim of ensuring that the code blocks do not allow malicious code to render.

Timezone

India Standard Time

Contact details

Email: shubhmehta942@gmail.com, shubhmehta837@gmail.com, 21bcs197@iiitdmj.ac.in
GitHub: Shubh942
LinkedIn: https://www.linkedin.com/in/shubh-mehta197/

Platform

Linux

Editor

Visual Studio Code (VS Code) is my preferred code editor because it works seamlessly with Ubuntu and offers many useful features and plugins. It's easy to use, with a simple interface and powerful debugging tools, making it great for writing code in languages like JavaScript, C, and C++. It also integrates well with Git for version control and collaboration. Overall, VS Code makes coding easier and more efficient, whether I'm working on personal projects or professional software development.

Programming experience

Over the past three years, I've been deeply immersed in programming, studying data structures, object-oriented programming, and other CS fundamentals. Through active engagement in competitive programming, I've attained Specialist rank on Codeforces, and I am also active on LeetCode and CodeChef. My exploration of cybersecurity has been fruitful: I have uncovered vulnerabilities in prominent companies such as Swiggy, Indeed, and Boatzon, and promptly reported them to their security teams. I've also participated in various hackathons.

I've developed numerous projects using libraries and frameworks such as React, Node, and Express. In these projects, I've prioritized security by conducting rigorous testing to ensure protection against vulnerabilities.

JavaScript experience

In the past three years, I've immersed myself in JavaScript, from learning its basics to working with advanced asynchronous programming. The async/await syntax makes asynchronous code cleaner and easier to follow. I've also explored the wider JavaScript ecosystem, including React, Node.js, and Express.js, each offering distinct capabilities.

Personally, I find arrow functions in JavaScript incredibly convenient. Their concise syntax and implicit return simplify code structure, making them a preferred choice for handling callbacks.

Node.js experience

In my personal projects with Node.js, I've predominantly utilized the Express library to develop server-side applications, integrating various databases such as PostgreSQL, MySQL, and MongoDB. Throughout these projects, I've implemented numerous middleware for authentication and routing, ensuring robust security and efficient navigation within the applications. Additionally, I've incorporated authorization features to maintain administrative functionality, allowing for seamless management of user privileges. Moreover, I've developed a multitude of APIs and managed authentication processes to safeguard sensitive data and ensure secure access to resources.

C/Fortran experience

My proficiency in C is grounded in a strong foundation built through coursework and participation in competitive programming. I possess a deep understanding of low-level programming concepts, adeptness in memory management, proficiency in implementing data structures, and skill in algorithm implementation. From system-level programming to embedded systems development, I've successfully completed projects spanning a diverse range of domains.

In addition, I'm currently engaged in a project involving the integration of GDB into a user-friendly interface for debugging C code without relying on the terminal.

Interest in stdlib

Stdlib, with its extensive collection of math functions, provides developers with a rich toolkit that significantly simplifies coding tasks. Offering a wide range of tools, Stdlib proves invaluable for developers working on diverse projects, spanning various domains. For me, Stdlib represents the perfect blend of two passions: mathematics and web development. Contributing to such a prominent organization would not only be a great learning experience but also a rewarding opportunity to make a meaningful impact in the developer community.

Version control

Yes

Contributions to stdlib

My contributions to stdlib so far are the following pull requests, and I am continuing to contribute.

Goals

Project Idea

My project aims to enhance website documentation by making code blocks interactive. These interactive code blocks will allow users to edit the code and receive real-time annotations on the output. By enabling users to edit arrays and instantly review the response, the interactive code blocks will significantly improve the user experience. The proposed design includes mechanisms to track and respond to changes effectively.

image

Description

I'm flexible with adapting the design to meet specific requirements. It's essential to note that the use of the require function is restricted to the Node.js environment and cannot be utilized in a browser setting because web browsers do not have a built-in module system that directly supports require. Moreover, granting permission to execute entire code blocks poses security risks, as malicious users could inject harmful payloads into the site. To mitigate this risk, input sanitization measures are implemented to prevent the execution of code if it does not adhere to the expected format. This helps ensure that only safe and expected inputs are processed and executed within the code blocks.

stdlib packages should be integrated dynamically: each interactive code block loads only the packages it requires, at the moment it requires them. This lazy loading improves performance and user experience by avoiding unnecessary resource loading.

Idea for running require Function

Bundling the required packages together with an entry function produces a single script that the browser can run. The entry function accepts the user's input and produces the corresponding output, so require calls are resolved when the bundle is built rather than in the browser.

For making code Blocks interactive

There are several code editor libraries we could use to provide real-time annotations as the user edits:

  1. Ace Editor
  2. CodeMirror
  3. We also have the option to develop our own logic for creating the editor with annotations, eliminating the need for external libraries. This approach ensures independence from third-party dependencies, allowing us to tailor the editor precisely to our requirements.

Idea for security measures:

  1. Input Sanitization: Ensure that user input is properly sanitized before being rendered in code blocks. This involves removing or escaping any potentially malicious HTML, JavaScript, or other executable code.
  2. Input Validation: Validate user input to ensure that it adheres to expected formats and structures. Reject or sanitize input that does not meet validation criteria before rendering it in code blocks.
  3. Escape HTML Characters: HTML characters such as < or > can cause problems when rendered, so we escape them before rendering. PHP provides htmlspecialchars() for this encoding; in the browser we would need an equivalent JavaScript routine.
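Since htmlspecialchars() belongs to PHP, a browser-side implementation needs its own escaping routine. The following is a minimal sketch under that assumption; the helper name escapeHtml and the lookup table HTML_ESCAPES are hypothetical and not part of the prototype.

```javascript
// Hypothetical helper (not from the prototype): escape the five characters
// that are significant in HTML so user text is rendered as text, not markup.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;'
};

function escapeHtml(text) {
  // Replace each special character with its entity from the table above.
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtml('<img src=x onerror="alert(1)">'));
// prints: &lt;img src=x onerror=&quot;alert(1)&quot;&gt;
```

Escaping only protects the place where text is rendered as HTML; it does not by itself make evaluating user code safe.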

Example of injecting payload

image

Our code returns NaN, because it sanitizes the input.

I have also created a working prototype of my idea, which can be seen in the video below. Video Link: Link

I have also attached the GitHub link and the hosted link of my GDB project, in which I integrated a code editor with a folder structure. GitHub link: Link Hosted link: Link

Implementation of Prototype:

Hosted Link: Link GitHub Link: Link

I've developed an implementation for executing code blocks directly in the browser. To achieve real-time annotations, I've integrated the logic from my GDB-Ui project, where a code editor has been implemented. I can use the logic of integrating the code editor in stdlib for real-time annotations.

When code is executed directly in the browser, errors such as "require is not defined" occur because the browser environment does not provide the require function. To work around this limitation, a bundling tool such as Webpack or Browserify can package the required functionality into a single script. An entry file, for example bundle.js, exposes a function that accepts the user's code and returns the corresponding output, and the bundler turns it into a browser-ready file in which the require calls are already resolved.

  1. Use a bundling tool such as Webpack or Browserify to package the required functionality into a single bundle.
  2. Create an entry file, such as bundle.js, that exposes a function which accepts the user's code and returns the corresponding output.
  3. Run the command below to create bundled.js:
    browserify bundle.js -o bundled.js

    This produces the bundled.js file, which can be included in the HTML page. Currently, I perform all of these steps manually, but I am researching ways to automate them.

Why this project?

Implementing this project requires a deep understanding of cybersecurity principles and the ability to tackle complex challenges. As someone deeply passionate about both security and code blocks, I'm excited about the technical hurdles involved in designing and implementing interactive code blocks.

With a background in computer science and software development, this project resonates perfectly with my interests and expertise. I bring a blend of theoretical knowledge and hands-on experience to the project, enabling me to offer valuable insights and solutions.

I'm eager to embark on this journey and contribute to the JavaScript ecosystem. This project perfectly aligns with my skills and aspirations, fueling my enthusiasm to make significant contributions and ensure its success.

Qualifications

I am a full-stack developer with a deep understanding of cybersecurity, and I am also active in competitive programming. Achieving Specialist rank on Codeforces, and being active on LeetCode and CodeChef, underscores my proficiency in algorithmic problem-solving. Beyond programming challenges, I am also a bug hunter: I have discovered and disclosed critical vulnerabilities in prominent organizations such as Swiggy, Indeed, and Boatzon.

  1. Swiggy: Discovered a critical threat allowing unauthorized access to restaurant accounts without OTP verification.
  2. Indeed: Uncovered two XSS vulnerabilities in profile and details pages, potentially enabling attackers to inject malicious scripts, compromise user data, and execute unauthorized actions.
  3. Boatzon: Identified a loophole allowing users to buy products at zero cost through price tampering, posing a significant risk to financial transactions and platform integrity.

These vulnerabilities have been reported to the companies and have since been fixed.

Moreover, I've assumed leadership roles, notably as Security and Backend Lead in our college's Fusion Open Source project. The project is for maintaining institutional affairs.

My experience in cyber security has been beneficial in writing code without bugs.

I've demonstrated my proficiency in utilizing web APIs and handling real-time data within a JavaScript environment, alongside a strong focus on cyber security to ensure secure project development. These experiences have equipped me with the skills necessary to excel in projects like this.

Prior art

In my research for this project, I delved into diverse resources to gain insights and grasp existing implementations thoroughly. I discovered the utility of bundling packages through tools like Webpack or Browserify, as well as the versatility of implementing code editors using libraries such as Ace Editor or Codemirror, or even building one from scratch. Additionally, I explored security measures extensively to ensure the robustness of the project.

For bundling the packages

Bundling is vital for this project because it is what makes the code blocks executable. A video demonstrates bundling with Browserify, including use of the require function, and a freeCodeCamp article gives detailed guidance on the task.

Bundling via Browserify: Link
freeCodeCamp article: Link

For implementing the code editor

Ace Editor (embedding the editor in the site): Link
Without a library (building the editor on a text box): Link

For security

To ensure application security, it's crucial to have a thorough understanding of potential threats.

XSS by PortSwigger: Link

The PortSwigger article demonstrates various XSS attacks and how they work, serving as a guide for securing applications against such vulnerabilities.

Commitment

During my summer vacation from May to the first week of July, I will dedicate 40 hours per week to the project. Once my college resumes, I will be able to allocate 20-22 hours per week starting from that time.

Acknowledging the importance of this project, I aim to dedicate full-time hours during the summer. I'll collaborate with my mentor to brainstorm implementation strategies and task distribution, ensuring the successful achievement of project milestones.

1 May - 26 May -> Bonding Period
27 May - first week of July -> 40 hours/week (40 × 6 = 240 hours)
8 July - 17 August -> about 21 hours/week (counted as 20 × 6 = 120 hours)
Total = 240 + 120 = 360 hours

Schedule

Assuming a 12 week schedule,

Notes:

Related issues

Checklist

Pranavchiku commented 3 months ago

Hey @Shubh942, impressive proposal, thanks for applying! I have a few suggestions and questions, though things look good to me.

Shubh942 commented 3 months ago

@Pranavchiku, thank you for your response. I added the column on prototype implementation and also added the necessary links, can you please review it also? Regarding the automation of this functionality, I'm actively researching and exploring various approaches to streamline the process. Any suggestions or guidance from your end would be highly appreciated. Additionally, I will also focus on my contribution part and make a good contribution to stdlib.

kgryte commented 3 months ago

@Shubh942 Thanks for sharing your draft proposal. A few comments:

  1. You've mentioned security risks. I am curious whose security we are concerned about. For code evaluation, it is happening on a user's local machine, in their web browser, and we shouldn't be performing any execution on our servers. In fact, the entire point is to leverage a user's local browser for example execution. So, I am not following the concerns about sanitation, etc.
  2. In your examples, you provide an output textarea for displaying results. Our preference would be to leverage our current doctest comment convention for displaying results. Do you have any thoughts on how you might be able to leverage those comments?
  3. One of the key problems to solve for this project is the dependency loading problem. Namely, each code block may have a different set of stdlib dependencies, and we cannot simply generate specialized bundles for each code block. Further, as users should be able to edit code blocks and dynamically require other stdlib dependencies, generating code block bundles ahead of time would not be sufficient. One approach is to use our ES module builds, which you can learn more about by searching some of our standalone stdlib repositories.

Shubh942 commented 3 months ago

Thank You @kgryte for your response.

After further research and examining your suggestion to utilize ES modules, I have found a more robust approach for this task. Additionally, I have explored the repository provided by stdlib for reference on utilizing ES modules. Based on this, here's the plan I propose for approaching this project.

Prototype Video

https://github.com/stdlib-js/google-summer-of-code/assets/93862397/1c7daf5a-2e4d-4643-8339-92d8255935ea

In the above video, I defined two code blocks

  1. Import statement
  2. User code

The import statements in the first code block serve as the entry point for executing the user's code. From them, we extract the names of the dependencies that the user's code relies on. We then import only these dependencies and make them available to the user's code, so that it can call the functions provided by stdlib.
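One way the extraction step could look is sketched below. The helper name extractDependencies and its regular expression are assumptions for illustration, not the prototype's code, and the sketch handles only default imports of the form `import name from '...'`, one per line.

```javascript
// Hypothetical helper (not from the prototype): collect the local name and
// module specifier of every default import statement in a code block.
function extractDependencies(code) {
  const pattern = /^[ \t]*import\s+([A-Za-z_$][\w$]*)\s+from\s+['"]([^'"]+)['"][ \t]*;?[ \t]*$/gm;
  const deps = [];
  let match;
  while ((match = pattern.exec(code)) !== null) {
    deps.push({ name: match[1], specifier: match[2] });
  }
  return deps;
}

const importBlock = [
  "import dnansumpw from 'https://cdn.jsdelivr.net/gh/stdlib-js/blas-ext-base-dnansumpw@esm/index.mjs';",
  "import abs from '@stdlib/math/base/special/abs';"
].join('\n');

console.log(extractDependencies(importBlock).map((dep) => dep.name));
// prints: [ 'dnansumpw', 'abs' ]
```

A regular expression is enough for a sketch, but a real implementation would probably want a JavaScript parser so that named imports, comments, and multi-line statements are handled correctly.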

The way of performing this task

Initially, we can carry out the following operation for each dependency.

import dnansumpw from 'https://cdn.jsdelivr.net/gh/stdlib-js/blas-ext-base-dnansumpw@esm/index.mjs';

The reference above is taken from: Link
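To load dependencies on demand rather than hard-coding one import per package, the URL could be built from the package name and passed to a dynamic import(). The sketch below assumes, based only on the single URL shown above, that '@stdlib/a/b/c' corresponds to the repository 'a-b-c' with an 'esm' branch; the helper names toEsmUrl and loadDependency are hypothetical.

```javascript
// Assumed convention, inferred from the one example URL in this thread:
// '@stdlib/blas/ext/base/dnansumpw' -> repository 'blas-ext-base-dnansumpw', branch 'esm'.
function toEsmUrl(pkg) {
  const prefix = '@stdlib/';
  if (!pkg.startsWith(prefix)) {
    throw new Error('expected an @stdlib package name, got: ' + pkg);
  }
  const repo = pkg.slice(prefix.length).split('/').join('-');
  return 'https://cdn.jsdelivr.net/gh/stdlib-js/' + repo + '@esm/index.mjs';
}

// Hypothetical loader: dynamic import() returns a promise for the module
// namespace; the function itself is the module's default export.
async function loadDependency(pkg) {
  const mod = await import(toEsmUrl(pkg));
  return mod.default;
}

console.log(toEsmUrl('@stdlib/blas/ext/base/dnansumpw'));
// prints: https://cdn.jsdelivr.net/gh/stdlib-js/blas-ext-base-dnansumpw@esm/index.mjs
```

Because the URL is computed at run time, a dependency that the user adds while editing can be fetched without building a bundle ahead of time.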

image

We take this approach because the require function is not available in web browsers, whereas ES module imports are. Therefore, utils.js exports a helper function that loads the modules, and we import that helper:

import importModules from "./utils.js";

After that, we can use this in our browser for code execution.

    // importModules (from ./utils.js) resolves to an object that maps
    // each dependency name to its loaded module
    const modules = await importModules();
    const resolved = {};
    dependencies.forEach((name) => {
      if (!Object.prototype.hasOwnProperty.call(modules, name)) {
        // Skip names for which no module was loaded
        return;
      }
      const mod = modules[name];
      // Prefer the module's default export; if there is none, use the module directly
      const fn = (mod && mod.default !== undefined) ? mod.default : mod;
      // Assign the function to the window object to make it available in the user code
      window[name] = fn;
      // Keep a local record of everything that was resolved
      resolved[name] = fn;
    });

Using the method described above, we can ensure that the dependencies are made available to the user. Consequently, the user will have the flexibility to call any dependency within their code blocks as needed.

 const result = eval(userCode);
 document.getElementById("output").innerText = result; // Display the output

We've set up the dependencies as defaults so that the user's code can seamlessly utilize these functions.
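As a possible alternative to assigning dependencies to window and calling eval, the dependencies could be passed to the user code as named parameters. This is a sketch of a different technique from the prototype, not a description of it; runUserCode is a hypothetical name, and the sum function below is a stand-in rather than a stdlib function.

```javascript
// Hypothetical alternative to eval + window globals: compile the user code
// into a function whose parameters are the dependency names.
function runUserCode(userCode, deps) {
  const names = Object.keys(deps);
  const values = names.map((name) => deps[name]);
  // The user code becomes the function body, so it must `return` its result
  // (unlike eval, which yields the value of the last expression).
  const fn = new Function(...names, '"use strict";\n' + userCode);
  return fn(...values);
}

const result = runUserCode('return sum([1, 2, 3]);', {
  // Stand-in dependency for illustration only
  sum: (arr) => arr.reduce((acc, x) => acc + x, 0)
});
console.log(result);
// prints: 6
```

This keeps the dependencies out of the global scope, so two code blocks on the same page cannot overwrite each other's names. It is not a sandbox: the user code still runs with full access to the page.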

Shubh942 commented 3 months ago

A practical approach to using the commented output as the result is to tag each commented output with a class name such as result1, result2, result3, and so forth. This lets us access and update each result through its corresponding class. Once the code has run, we can iterate over the elements in a loop and assign the respective value to each one.

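resultArray.forEach((element, index) => {
  // Tag each annotated output with result1, result2, ... so it can be looked up later
  const className = `result${index + 1}`;
  element.classList.add(className);

  // Placeholder: show the class name until the computed value is written into the element
  element.textContent = className;
});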
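The same idea can be applied to the source text itself rather than to DOM elements. The sketch below assumes that results are annotated with line comments of the form `// returns <value>` and rewrites them in order; updateAnnotations is a hypothetical helper, and how the values are computed is outside its scope.

```javascript
// Assumption: each result is annotated by a line comment `// returns <value>`.
// Hypothetical helper: replace the annotations, in order, with new values.
function updateAnnotations(code, values) {
  let i = 0;
  return code.split('\n').map((line) => {
    const match = /^(\s*\/\/ returns)\b.*$/.exec(line);
    if (match === null || i >= values.length) {
      // Not an annotation, or no value left: keep the line unchanged
      return line;
    }
    const updated = match[1] + ' ' + String(values[i]);
    i += 1;
    return updated;
  }).join('\n');
}

const source = [
  'var x = abs( -2.0 );',
  '// returns 2.0',
  'var y = abs( 3.0 );',
  '// returns 3.0'
].join('\n');

console.log(updateAnnotations(source, [5, 7]));
```

Matching annotations to values by position is the simplest scheme; it breaks down if the user adds or removes an annotated statement without the values list being recomputed.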

Shubh942 commented 3 months ago

The website appears to be susceptible to reflected XSS (Cross-Site Scripting) attacks, as it directly executes JavaScript code provided by the user without proper sanitization. In such a scenario, the users themselves could unknowingly execute malicious scripts provided by an attacker, leading to potential compromise of their own sensitive data.

To mitigate the risk posed by such threats, we implement sanitization measures to filter out any potentially harmful content from user input.

image