eehab-saadat / PlagPatrol

The ultimate responsive, web-based, OCR-embedded plagiarism checker, which generates PDF plagiarism reports and supports handwritten documents. Created as an end-of-semester project.

Add PDF Report Generator #3

Closed eehab-saadat closed 1 year ago

eehab-saadat commented 1 year ago

Add PDF Report Generator

Your task is to create a `reporter.py` file in the `./utils/` folder. It will act as a Python module providing the utilities that the core `app.py` file uses for PDF plagiarism report generation. In this file, implement a `generate_report` function, accompanied by a minimal number of helper functions if needed. The sample function signature is as follows:

from typing import Dict


def generate_report(filename: str, plagIndex: float, results: Dict[str, str], wordCount: int, charCount: int, downloadFolder: str) -> None:
    """
    Generate a PDF plagiarism report based on the given information and save it in the '/tmp/' directory.

    Args:
        filename: A string denoting the original filename.
        plagIndex: A float representing the percentage of plagiarized content (as a value between 0 and 1).
        results: A dictionary with string-string key-value pairs, where the keys represent the original phrases
            and the values represent the plagiarized phrases.
        wordCount: An integer representing the total number of words in the submitted document.
        charCount: An integer representing the total number of characters in the submitted document.
        downloadFolder: A string denoting where to store the report. This will be the '/tmp/' folder, which is
            also where the original file is stored.

    Returns:
        None
    """

As illustrated above, the function receives a total of six parameters, of which the first two are values from the unpacked tuple returned by the `check_plagiarism` function implemented in the `/utils/scrapper.py` file.

Examples

{"Never gonna give you up" : "https://www.youtube.com/watch?v=dQw4w9WgXcQ"} # (found)

or

{"Batman's parents" : ""} # (not found)
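The two examples above suggest that an empty value marks a phrase for which no source was found. A minimal sketch of separating the two cases is shown here; the helper name `split_results` is hypothetical, and treating whitespace-only values as "not found" is an assumption rather than something the issue specifies.

```python
from typing import Dict, List, Tuple


def split_results(results: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Separate phrases that have a source from phrases that have none.

    Hypothetical helper: not part of the required generate_report signature.
    """
    found: Dict[str, str] = {}
    not_found: List[str] = []
    for phrase, source in results.items():
        # Assumption: an empty or whitespace-only value means "not found".
        if source and source.strip():
            found[phrase] = source.strip()
        else:
            not_found.append(phrase)
    return found, not_found


found, not_found = split_results({
    "Never gonna give you up": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "Batman's parents": "",
})
```

Only the `found` entries would then feed the references section, while `not_found` could be ignored or listed separately.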

- `wordCount` and `charCount` would represent the total number of words and characters in the document for utilization in the report header.
- The `downloadFolder` would be a string representing the location of the original file and where the report is to be saved.

You have to figure out and design a report generator which creates a report with a user-friendly and legible structure, using the provided parameters, or more if needed. Some key ideas are to include the logo first (contact @1Zain for the logo), then present the metadata, maybe some graphical statistics such as a chart, then display the original text with highlighted or hyperlinked plagiarized phrases, and finally add a references section listing the plagiarism sources extracted from the `results` dict. You can alter the structure as per feasibility and creativity of ideas, with the acceptance criteria strictly being usability and legibility.
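One way to keep the layout independent of whichever PDF library is chosen is to assemble the textual content first and render it afterwards. The sketch below covers only the metadata header and the references list; the logo, chart and highlighted text are left out. The name `build_report_lines`, the labels and the ordering are assumptions for illustration, not a prescribed design.

```python
from typing import Dict, List


def build_report_lines(filename: str, plagIndex: float, results: Dict[str, str],
                       wordCount: int, charCount: int) -> List[str]:
    """Return the report's text content, one entry per line (hypothetical helper)."""
    lines = [
        "Plagiarism Report",
        f"File: {filename}",
        # plagIndex is a fraction between 0 and 1, so format it as a percentage.
        f"Plagiarism index: {plagIndex:.1%}",
        f"Words: {wordCount}",
        f"Characters: {charCount}",
        "",
        "References",
    ]
    # Assumption: only phrases with a non-empty source belong in the references.
    sources = [(phrase, source) for phrase, source in results.items() if source]
    if not sources:
        lines.append("No matching sources were found.")
    for number, (phrase, source) in enumerate(sources, start=1):
        lines.append(f'[{number}] "{phrase}": {source}')
    return lines


lines = build_report_lines(
    "essay", 0.42,
    {"Never gonna give you up": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
     "Batman's parents": ""},
    1200, 6400,
)
```

Each line could then be written to the PDF page by page, which also makes the content easy to unit test without opening a PDF.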

This function returns nothing, hence it is a procedure whose sole task is to create the report and save it inside the `downloadFolder` directory. You may also include a step that deletes the original file after the report has been generated, to save resources, as the file may no longer be needed.
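If the original file is deleted, it is probably safest to do so only once the report verifiably exists. A minimal sketch follows, assuming the original file lives at `downloadFolder/filename`; the helper name `remove_original` and its `report_path` parameter are hypothetical.

```python
from pathlib import Path


def remove_original(downloadFolder: str, filename: str, report_path: str) -> bool:
    """Delete the original upload, but only if the report was actually written.

    Returns True when cleanup ran, False when the original was kept.
    """
    report = Path(report_path)
    if not report.is_file() or report.stat().st_size == 0:
        # Keep the original so the report can be regenerated later.
        return False
    # missing_ok avoids an error if the file is already gone (Python 3.8+).
    (Path(downloadFolder) / filename).unlink(missing_ok=True)
    return True
```

Calling this as the last step of `generate_report` would keep a failed report run from also losing the uploaded document.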

Follow a standard naming convention for the generated report to establish generality, for example, as follows:
```py
f"{filename}_plagiarism_report.pdf"
```

However, the naming, the way the original file is reconstructed, and the structure of the report are yours to figure out and design using the provided information, as this task has been allotted to you. You are also to add relevant exception handling and submit a final stable build.
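As a starting point for the exception handling, the inputs could be checked before any rendering begins, so that failures surface with a clear message. The sketch below also builds the output path from the naming convention given in this issue. The helper names `report_path` and `validate_inputs`, and the specific checks, are assumptions and not requirements.

```python
import os
from typing import Dict


def report_path(filename: str, downloadFolder: str) -> str:
    """Build the output path using the naming convention from this issue."""
    # basename() drops any directory part so the report stays in downloadFolder.
    name = os.path.basename(filename)
    return os.path.join(downloadFolder, f"{name}_plagiarism_report.pdf")


def validate_inputs(filename: str, plagIndex: float, results: Dict[str, str],
                    wordCount: int, charCount: int, downloadFolder: str) -> None:
    """Raise early instead of failing midway through report generation."""
    if not filename:
        raise ValueError("filename must not be empty")
    if not 0.0 <= plagIndex <= 1.0:
        raise ValueError(f"plagIndex must be between 0 and 1, got {plagIndex}")
    if wordCount < 0 or charCount < 0:
        raise ValueError("wordCount and charCount must be non-negative")
    if not isinstance(results, dict):
        raise TypeError("results must be a dict mapping phrases to sources")
    if not os.path.isdir(downloadFolder):
        raise FileNotFoundError(f"downloadFolder does not exist: {downloadFolder}")
```

`generate_report` could call `validate_inputs` first and wrap the rendering and saving step in a `try`/`except OSError` block, so that write failures are reported rather than crashing the app.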

In a gist, you are to perform the following:

Feel free to use the comment section below for any further queries or requests related to this issue.

Note: You are to create a branch named `add-report-generator` with `main` as its source, and push your commits only to that branch and NOT to `main`.

Once your feature is finished, comment down below to let me know so that I can merge it with main.