pucardotorg / dristi

MIT License

Upload faulty files #1776

Closed atulgupta2024 closed 7 hours ago

atulgupta2024 commented 1 month ago

Rename a MOV file to PDF and upload it where a PDF is required. Also test with smaller video files and WAV files, so that the file stays under the file size limit and is not rejected on size alone.

Also try uploading a shell script file. It should not execute on the server side.

atulgupta2024 commented 2 weeks ago

can be tested fairly quickly. must be done before go live

@suresh12 @Ramu-kandimalla

atulgupta2024 commented 2 weeks ago

@Ramu-kandimalla here are some additional guidelines for this. can be shared with the team

To prevent malicious file uploads, especially those where users rename files to bypass restrictions, consider these measures:

  1. File Type Validation:

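    • MIME Type Checking: Detect the file's MIME type from its content on the server side. Do not trust the Content-Type header or the extension sent by the client, since both usually follow the file name. When a MOV file is renamed to a PDF, content-based detection still reports video/quicktime rather than application/pdf.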
    • Magic Number Inspection: Look at the file’s "magic number" (first few bytes of the file). Each file type has a unique identifier in these bytes, making it harder for renamed files to bypass validation.
  2. Content Scanning:

    • Library-Based Validation: Use a library specific to each file type (like PDFBox for PDFs) to validate the structure of the uploaded file. If the file structure doesn’t match the expected format, reject it.
  3. Restrict File Size:

    • Set a file size limit for each type to avoid oversized malicious files.
  4. Sanitize and Secure File Storage:

    • Store files outside of the web-accessible directories and avoid direct access to uploaded files through their URLs.

These checks can help ensure that your application only accepts genuine files of the expected type, even if users attempt renaming tricks.
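Points 3 and 4 above could be sketched roughly as below. This is a minimal illustration, not the project's implementation: the class name `UploadStore`, the 5 MB limit and the `.bin` suffix are all assumptions made for the example. The idea is to count bytes while copying so an oversized upload is rejected early, and to store the file under a generated name in a directory the caller chooses (ideally outside the web root) so the client-supplied file name never becomes part of a path.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.UUID;

final class UploadStore {

    // Hypothetical limit; the real per-type limits are not stated in this issue.
    static final long MAX_BYTES = 5L * 1024 * 1024;

    static Path store(InputStream upload, Path storageDir) throws IOException {
        return store(upload, storageDir, MAX_BYTES);
    }

    // Copies the upload into storageDir under a random name and fails once
    // more than maxBytes have been read. The content is only written as data.
    static Path store(InputStream upload, Path storageDir, long maxBytes) throws IOException {
        Files.createDirectories(storageDir);
        // The stored name is generated here; the original file name is not used.
        Path target = storageDir.resolve(UUID.randomUUID() + ".bin");
        byte[] buffer = new byte[8192];
        long written = 0;
        try (OutputStream out = Files.newOutputStream(target, StandardOpenOption.CREATE_NEW)) {
            int n;
            while ((n = upload.read(buffer)) != -1) {
                written += n;
                if (written > maxBytes) {
                    throw new IOException("Upload exceeds size limit of " + maxBytes + " bytes");
                }
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            // Do not leave a partial file behind when the upload is rejected.
            Files.deleteIfExists(target);
            throw e;
        }
        return target;
    }
}
```

The original file name, if it is needed for display, would be kept as metadata (for example in a database row) rather than on disk.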

Here are sample magic numbers and sample code that can be used.

Many file types use a "magic number" (specific byte sequences at the beginning of the file) to identify their format, regardless of the file extension. Here are some common file types and their magic numbers:

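| File Type | Magic Number (Hexadecimal) |
|---|---|
| PDF | %PDF (0x25 0x50 0x44 0x46) |
| JPEG | 0xFF 0xD8 0xFF (the fourth byte varies, e.g. 0xE0 for JFIF, 0xE1 for Exif) |
| PNG | 0x89 0x50 0x4E 0x47 |
| GIF | 0x47 0x49 0x46 0x38 |
| ZIP/Docx/PPTx | 0x50 0x4B 0x03 0x04 |
| MP4 | 0x00 0x00 0x00 0x18 0x66 0x74 0x79 0x70 0x6D 0x70 0x34 0x32 (the first four bytes are a box size and vary between files; "ftyp" sits at offset 4, followed by a brand such as "mp42") |
| MOV | 0x00 0x00 0x00 0x14 0x66 0x74 0x79 0x70 0x71 0x74 (the box size varies; "ftyp" at offset 4, followed by the brand "qt") |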
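For the MP4 and MOV rows, comparing from byte 0 is fragile because the leading four bytes are a length field. A rough sketch of an offset-based check is shown here. The class and method names are made up for illustration, and it assumes the file starts with an `ftyp` box, which is typical for MP4 and modern MOV files but not guaranteed for older QuickTime files.

```java
import java.nio.charset.StandardCharsets;
import java.util.Optional;

final class IsoMediaSniffer {

    // Returns the four-character major brand (e.g. "mp42" or "qt  ") when
    // bytes 4..7 of the header spell "ftyp"; otherwise returns empty.
    static Optional<String> majorBrand(byte[] header) {
        if (header == null || header.length < 12) {
            return Optional.empty();
        }
        String boxType = new String(header, 4, 4, StandardCharsets.US_ASCII);
        if (!boxType.equals("ftyp")) {
            return Optional.empty();
        }
        return Optional.of(new String(header, 8, 4, StandardCharsets.US_ASCII));
    }

    // True when the header carries the QuickTime brand, which is "qt"
    // padded with two spaces to four characters.
    static boolean isQuickTime(byte[] header) {
        return majorBrand(header).map(brand -> brand.equals("qt  ")).orElse(false);
    }
}
```

With this, a MOV file renamed to `.pdf` would still be reported as QuickTime, because the check looks only at the bytes and ignores the file name.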

Java Code to Validate Magic Numbers

Here is sample Java code that reads the first few bytes of a file and checks the magic number against expected values:

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class FileMagicNumberValidator {

    // Define magic numbers for file types
    private static final byte[] PDF_MAGIC = {0x25, 0x50, 0x44, 0x46}; // %PDF
    private static final byte[] JPEG_MAGIC = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF}; // 4th byte varies (0xE0 JFIF, 0xE1 Exif)
    private static final byte[] PNG_MAGIC = {(byte) 0x89, 0x50, 0x4E, 0x47}; // PNG

    public static void main(String[] args) {
        File file = new File("path/to/your/file.pdf");
        try {
            if (isValidPDF(file)) {
                System.out.println("The file is a valid PDF.");
            } else {
                System.out.println("The file is not a valid PDF.");
            }
        } catch (IOException e) {
            System.out.println("Error reading the file: " + e.getMessage());
        }
    }

    // Checks only the leading magic number; this does not prove the rest of the file is a well-formed PDF
    public static boolean isValidPDF(File file) throws IOException {
        return hasMagicNumber(file, PDF_MAGIC);
    }

    // General function to check if file has specific magic number
    private static boolean hasMagicNumber(File file, byte[] magicNumber) throws IOException {
        try (FileInputStream fis = new FileInputStream(file)) {
            byte[] fileBytes = new byte[magicNumber.length];
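            // readNBytes (Java 9+) keeps reading until the buffer is full or the stream ends;
            // a single read() call is allowed to return fewer bytes than requested.
            if (fis.readNBytes(fileBytes, 0, fileBytes.length) != fileBytes.length) {
                return false; // File is too short to have the magic number
            }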
            // Compare magic number with file bytes
            for (int i = 0; i < magicNumber.length; i++) {
                if (fileBytes[i] != magicNumber[i]) {
                    return false;
                }
            }
            return true;
        }
    }
}

Explanation

This code can be extended to support other file types by adding corresponding byte arrays for each file type’s magic number.
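One possible shape for that extension is a small table of signatures that is searched in order. The sketch below is only an illustration: the names `SignatureTable`, `FileType`, `detect` and `isExpected` are hypothetical, and it covers just the signatures from the list above that start at byte 0.

```java
import java.util.Optional;

final class SignatureTable {

    // Each constant carries the leading bytes that identify the format.
    enum FileType {
        PDF(0x25, 0x50, 0x44, 0x46),
        JPEG(0xFF, 0xD8, 0xFF),
        PNG(0x89, 0x50, 0x4E, 0x47),
        GIF(0x47, 0x49, 0x46, 0x38),
        ZIP(0x50, 0x4B, 0x03, 0x04); // also matches DOCX/PPTX, which are ZIP containers

        private final byte[] magic;

        FileType(int... bytes) {
            magic = new byte[bytes.length];
            for (int i = 0; i < bytes.length; i++) {
                magic[i] = (byte) bytes[i];
            }
        }

        boolean matches(byte[] header) {
            if (header.length < magic.length) {
                return false;
            }
            for (int i = 0; i < magic.length; i++) {
                if (header[i] != magic[i]) {
                    return false;
                }
            }
            return true;
        }
    }

    // Returns the first type whose signature matches the start of the header.
    static Optional<FileType> detect(byte[] header) {
        for (FileType type : FileType.values()) {
            if (type.matches(header)) {
                return Optional.of(type);
            }
        }
        return Optional.empty();
    }

    // True only when the detected type equals the type the upload field requires.
    static boolean isExpected(byte[] header, FileType expected) {
        return detect(header).map(type -> type == expected).orElse(false);
    }
}
```

A caller would read the first bytes of the upload once and pass them to `isExpected` with the type the form field requires. An unknown header is treated as a rejection, not as a pass.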

Ramu-kandimalla commented 1 week ago

Hi @atulgupta2024, please confirm the severity for go live

radheshjoshi1 commented 3 days ago

pr is merged in dev env

radheshjoshi1 commented 3 days ago

dev testing is done for it, qa can test it @Ramu-kandimalla

bhuvanyuguru commented 7 hours ago

Hi @Ramu-kandimalla, @rajeshcherukumalli, closing this ticket. Please refer to #2401