processing / p5.js-web-editor

The p5.js Editor is a website for creating p5.js sketches, with a focus on making coding accessible and inclusive for artists, designers, educators, beginners, and anyone else! You can create, share, or remix p5.js sketches without needing to download or configure anything.
https://editor.p5js.org
GNU Lesser General Public License v2.1

Convert `project.controller` to `async`/`await` syntax #3025

Closed lindapaiste closed 4 months ago

lindapaiste commented 4 months ago

Ref #521 Ref #3024

Changes:

Improvements: Often when I refactor a part of the code I find things that are not right. Here are some specific problems that I fixed:

I have verified that this pull request:

@raclim It is much easier to review the changes with the "Ignore whitespace" option. I finally figured out how to do that on GitHub!

Swarnendu0123 commented 4 months ago

I have reviewed the file and I have a few suggestions:

  1. The updateProject function catches errors with a generic message and returns a 400 status code. It's better to provide more specific error messages or log the errors for debugging purposes.

  2. The forEach loop in bundleExternalLibs is asynchronous due to the usage of async inside, but it's not awaited. This means it might not execute in the expected order. We can consider using for...of loop with await inside or use Promise.all() to wait for all asynchronous tasks to complete.

  3. Inside the loop in bundleExternalLibs, the code manipulates the project.files array multiple times, which can be inefficient for large arrays. We can consider using temporary arrays or objects to store data and manipulate them outside the loop for better performance.

  4. The forEach loop in bundleExternalLibs contains an async function, but it doesn't await anything inside the loop. Since the loop doesn't depend on any asynchronous operations, it's unnecessary to make it async.

  5. In addFileToZip, error handling is present when fetching a file's content via Axios, but it only logs errors to the console. We can consider propagating errors back to the caller or providing a fallback mechanism if a file cannot be fetched.

  6. The buildZip function contains synchronous operations like format, which could potentially block the event loop, especially if it's handling multiple requests concurrently. We can consider optimizing synchronous operations or moving them out of the critical path.

  7. The buildZip function loads all project files into memory before creating the zip file. This could lead to memory issues for large projects. So, we can consider streaming file contents directly into the zip file to reduce memory usage.

lindapaiste commented 4 months ago

Thank you for the detailed and thoughtful review! This helps a lot more than a "LGTM".

Most of the things that you are mentioning are in existing code that I didn't change. There's always a balance between changing too much at once and wanting to fix everything. When I try not to change too much, it's usually because I don't want to overwhelm the reviewer. But since you're here and you're looking at this, let's do it!

I have reviewed the file and I have a few suggestions:

  1. The updateProject function catches errors with a generic message and returns a 400 status code. It's better to provide more specific error messages or log the errors for debugging purposes.

I'm not sure if we want to send specific MongoDB errors back in API responses. In a normal web app that could be a security concern, so I would send a somewhat vague error back to the user but log a specific error for developers to review. Since we're open source, security is probably not an issue here, as anyone can already see our database structure. I'm also not sure what sort of persistent logging capabilities we have. Only @raclim has access to the production environment. There was a recent PR #2814 which sets up Winston, which honestly I'm not too familiar with. At my work we use Sentry, which is a third-party web app.

The 400 Bad Request is probably not the best error code. I think the ones we want are 500 Internal Server Error when there's an unexpected database error and 404 Not Found (or possibly 422 Unprocessable Entity) if the project doesn't exist. We can check for a 404 up on line 24 when we query for the project.
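As a rough sketch of that split (the helper name and mapping are hypothetical, not code from this PR), the controller's catch block could classify the error, log the specific one for developers, and send the vaguer message and a more accurate status code to the client:

```javascript
// Hypothetical helper: map a caught error to a response, keeping the
// client-facing message deliberately vague.
function classifyUpdateError(error) {
  // Mongoose reports document validation problems as ValidationError.
  if (error && error.name === 'ValidationError') {
    return { status: 422, message: 'Project data failed validation.' };
  }
  // A malformed ObjectId in the request surfaces as a CastError.
  if (error && error.name === 'CastError') {
    return { status: 404, message: 'Project with that id does not exist.' };
  }
  // Anything else is an unexpected server-side failure.
  return { status: 500, message: 'Internal server error.' };
}
```

In the catch block this would pair with a developer-facing log (console.error, or the Winston logger from #2814) before sending `res.status(status).json({ success: false, message })`.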

It seems weird that we use Project.findByIdAndUpdate on line 43 when we already did Project.findById on line 24. Probably we started out with the combined Project.findByIdAndUpdate call and then added the earlier check. I imagine it's more performant to do an update on the project object than to find again so that's something that we can potentially clean up.

  2. The forEach loop in bundleExternalLibs is asynchronous due to the usage of async inside, but it's not awaited. This means it might not execute in the expected order. We can consider using for...of loop with await inside or use Promise.all() to wait for all asynchronous tasks to complete.

Good thoughts, but not needed due to point 4.
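For anyone following along, the ordering pitfall the review describes can be shown with a minimal standalone sketch (illustrative code, not from the PR):

```javascript
// forEach does not await async callbacks; for...of with await runs in order.
async function demo() {
  const order = [];
  const items = [1, 2, 3];

  // Anti-pattern: the promises returned by the async callbacks are dropped,
  // so execution continues before any callback body past `await` runs.
  items.forEach(async (n) => {
    await Promise.resolve();
    order.push(`forEach ${n}`);
  });
  order.push('after forEach'); // runs before any callback finishes

  // Correct: each iteration is awaited before the next begins.
  for (const n of items) {
    await Promise.resolve();
    order.push(`for...of ${n}`);
  }
  order.push('after for...of');
  return order;
}
```

Running `demo()` pushes `'after forEach'` before any of the forEach callbacks complete, while the for...of entries all land before `'after for...of'`.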

  3. Inside the loop in bundleExternalLibs, the code manipulates the project.files array multiple times, which can be inefficient for large arrays. We can consider using temporary arrays or objects to store data and manipulate them outside the loop for better performance.

Feel free to raise a PR with a specific improvement that you have in mind.

  4. The forEach loop in bundleExternalLibs contains an async function, but it doesn't await anything inside the loop. Since the loop doesn't depend on any asynchronous operations, it's unnecessary to make it async.

Yes! I noticed this and I was going to change it and then I decided to keep it as-is so that I wasn't changing too much. But that's dumb as this really should be changed. Will fix.

  5. addFileToZip, when fetching a file's content via Axios, error handling is present, but it only logs errors to the console. We can consider propagating errors back to the caller or providing a fallback mechanism if a file cannot be fetched.

This topic came up in an issue that I worked on recently where you couldn't download a sketch if it contained deleted files. I wasn't sure the best way to handle it. I settled on returning a 0kb file. I think that we do want to download what we can, rather than aborting the entire download, but I agree that it would be nice if there was some way to warn the user that their download contains missing files. See PR #2315 issue #2314.

  6. The buildZip function contains synchronous operations like format, which could potentially block the event loop, especially if it's handling multiple requests concurrently. We can consider optimizing synchronous operations or moving them out of the critical path.

  7. The buildZip function loads all project files into memory before creating the zip file. This could lead to memory issues for large projects. So, we can consider streaming file contents directly into the zip file to reduce memory usage.

This is definitely a concern of mine! @mhsh312 has been working on downloading entire collections (#2735 and #2707) and I am worried that it might be too much data. Maybe you can work on this? It's issue #643.

Swarnendu0123 commented 4 months ago

Thanks for your review. I would love to work on this.

I am a student from India, and currently my university exams are going on. I will start working on this by 23rd Jan, if we are not in a sprint.

When I reviewed it, I already had most of the implementations in mind. I will raise a PR against this branch. Then you can merge my PR into this PR, and this PR will be ready.

Swarnendu0123 commented 4 months ago

I got some free time, so I thought I should work out the solutions.

It seems weird that we use Project.findByIdAndUpdate on line 43 when we already did Project.findById on line 24. Probably we started out with the combined Project.findByIdAndUpdate call and then added the earlier check. I imagine it's more performant to do an update on the project object than to find again so that's something that we can potentially clean up.

It's a valid observation. Indeed, it seems redundant to perform a separate findByIdAndUpdate call on line 43 after already querying the project using findById on line 24. Consolidating these operations into a single database update would likely improve performance and simplify the codebase.

Here's how we can refactor the updateProject function to eliminate the redundant database call:

export async function updateProject(req, res) {
  try {
    const project = await Project.findById(req.params.project_id).exec();
    if (!project) {
      res.status(404).send({
        success: false,
        message: 'Project with that id does not exist.'
      });
      return;
    }
    if (!project.user.equals(req.user._id)) {
      res.status(403).send({
        success: false,
        message: 'Session does not match owner of project.'
      });
      return;
    }
    if (
      req.body.updatedAt &&
      isAfter(new Date(project.updatedAt), new Date(req.body.updatedAt))
    ) {
      res.status(409).send({
        success: false,
        message: 'Attempted to save stale version of project.'
      });
      return;
    }
    // Update the project directly without another database query
    Object.assign(project, req.body);
    const updatedProject = await project.save();
    res.json(updatedProject);
  } catch (error) {
    console.error(error);
    res.status(500).json({ success: false });
  }
}

By updating the project object retrieved from the database and then saving it, we avoid the need for an additional database query. This consolidation streamlines the code and improves performance by reducing database interactions.

Feel free to raise a PR with a specific improvement that you have in mind.

Okay, I will raise the PR.

This topic came up in an issue that I worked on recently where you couldn't download a sketch if it contained deleted files. I wasn't sure the best way to handle it. I settled on returning a 0kb file. I think that we do want to download what we can, rather than aborting the entire download, but I agree that it would be nice if there was some way to warn the user that their download contains missing files. See PR #2315 issue #2314.

Handling errors when fetching a file's content via Axios in the addFileToZip function is crucial for providing a robust and reliable file download experience. Propagating errors back to the caller or providing a fallback mechanism can help ensure that users are informed about any issues with file downloads.

Here's how we can enhance the error handling in the addFileToZip function:

/**
 * Recursively adds a file and all of its children to the JSZip instance.
 * @param {object} file
 * @param {Array<object>} files
 * @param {JSZip} zip
 * @return {Promise<void>} - modifies the `zip` parameter
 */
async function addFileToZip(file, files, zip) {
  if (file.fileType === 'folder') {
    const folderZip = file.name === 'root' ? zip : zip.folder(file.name);
    await Promise.all(
      file.children.map((fileId) => {
        const childFile = files.find((f) => f.id === fileId);
        return addFileToZip(childFile, files, folderZip);
      })
    );
  } else if (file.url) {
    try {
      const res = await axios.get(file.url, {
        responseType: 'arraybuffer'
      });
      zip.file(file.name, res.data);
    } catch (error) {
      console.error(`Error fetching file '${file.name}' from '${file.url}':`, error);
      // Provide a fallback mechanism or notify the user about the error
      // For example, we can create an empty file or log the error and proceed
      zip.file(file.name, new ArrayBuffer(0)); // Create an empty file
    }
  } else {
    zip.file(file.name, file.content);
  }
}

In this updated version of the function, we catch errors that occur during the file fetch operation using Axios. If an error occurs, we log the error to the console for debugging purposes, and then we provide a fallback mechanism by creating an empty file in the zip archive. This ensures that the file download process continues even if some files cannot be fetched successfully.

We can customize the fallback mechanism based on specific requirements and user-experience considerations. Providing feedback to users about any missing or inaccessible files is important for transparency and user satisfaction.
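One hedged idea for that feedback (the helper name and wording are hypothetical, not part of the PR): have addFileToZip's caller collect the names of files that failed to fetch and drop a small warning file into the zip alongside the empty placeholders.

```javascript
// Hypothetical helper: build the text of a warning file to add to the zip
// whenever some project files could not be fetched.
function buildMissingFilesNote(failedNames) {
  if (failedNames.length === 0) return null; // nothing to warn about
  return [
    'Some files could not be fetched and were saved as empty files:',
    ...failedNames.map((name) => `  - ${name}`)
  ].join('\n');
}
```

The caller would then do something like `zip.file('MISSING_FILES.txt', note)` when the helper returns a non-null note.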

This is definitely a concern of mine! @mhsh312 has been working on downloading entire collections (#2735 and #2707) and I am worried that it might be too much data. Maybe you can work on this? It's issue #643.

Cool.