mosparo / mosparo

The modern spam protection. Protects your forms from spam by simply checking the content. Open source, Free to use, Accessible, and Self-Hosted.
https://mosparo.io
MIT License

Auto verify? #166

Closed by Geremia 9 months ago

Geremia commented 9 months ago

Is there a way (besides tampering with the database itself) to mark each submission as automatically verified by default so I don't have to implement server-side verification?

zepich commented 9 months ago

Hi @Geremia

Thank you very much for your question.

No, there is no way to do that. It would not make sense either, since if you don't verify the submission in the backend/server-side, the spam protection is not working (no matter which spam protection you use: mosparo, reCAPTCHA, or another one).

The problem is that you can never trust the frontend. If you only integrate the frontend part of mosparo, the user can submit the form with spam.

Why do you not want to integrate the backend verification?

Kind regards,

zepich

Geremia commented 9 months ago

@zepich

> if you don't verify the submission in the backend/server-side, the spam protection is not working

Not entirely. Honeypot will still mark SPAM those submissions that have a non-null honeypot field, for example.

zepich commented 9 months ago

Hi @Geremia

> Not entirely. Honeypot will still mark SPAM those submissions that have a non-null honeypot field, for example.

Yes, that's correct. mosparo will also mark a submission as spam if it matches a rule (without verification).

The logic with the list of submissions in mosparo is the following:

A submission is visible if it was detected as spam (honeypot or matching rules) or if it was verified (successfully or not). Spam submissions have to be visible because the user could not submit the form: it may have been a real submission, not spam, and your rules or settings may be too strict. Verified submissions have to be visible, too: one may have been spam, and you need to see what data it contained, or the verification may have failed because of the minimum time setting, which you should then adjust.

But why do we not show the valid but not verified submissions? The reason for that is simple: if a user fills out the form and checks the mosparo checkbox, the user hasn't submitted the form. You haven't got any information from the user yet. Let's say you fill out the form and check the mosparo box but then decide not to submit the form. The website owner should not see these data - they were not submitted. mosparo automatically deletes these submissions after 24 hours to get rid of the user data.

The problem with mosparo without verification is that the user can delete the mosparo box from the form in the browser and submit the form without checking the mosparo box. mosparo will not validate anything in that case, and the user can submit whatever the user wants (because the user has no mosparo field to click).

Additionally, suppose the user submits the form without JavaScript or by directly sending the POST request. In that case, your backend will accept the POST without verifying if the form was submitted correctly from the frontend.

Even if the bad actor does not remove the mosparo box, it is possible to bypass it. For that, the bad actor fills out the form without spam and clicks the mosparo checkbox. Afterward, the bad actor can manipulate the form (add the spam content) and submit it. Since no backend verification is happening, the spam content will be submitted, and the spam protection will not work.

For all these possible bypasses, we need the backend verification to ensure the form data were validated by mosparo and not changed after the mosparo validation.

Geremia commented 9 months ago

@zepich

> But why do we not show the valid but not verified submissions? The reason for that is simple: if a user fills out the form and checks the mosparo checkbox, the user hasn't submitted the form. You haven't got any information from the user yet.

What if invisible mode is used?

zepich commented 9 months ago

@Geremia Yes, in that case, the user submitted the form so we could display the data in mosparo.

But still, using mosparo without backend verification is no protection at all. With some simple clicks (for example, disable JavaScript or remove the mosparo box from the form), you can completely remove the protection and submit as much spam as you want.

What is the problem for you with the backend verification? Do we not offer an integration for the system you use? If so, please tell me which system you're using so we can look at preparing an integration.

Geremia commented 9 months ago

@zepich Some more elaboration is needed in the "Performing verification" documentation. I don't understand how to trigger verification.

zepich commented 9 months ago

Hi @Geremia

Okay, thank you very much for your feedback.

So let's start with a simple example. Let's assume we have a small contact form with a text field for the name, one for the email address, and a textarea for the message. Below the message, we add the mosparo box. When the user submits the form, we send the data by email to our inbox.

[Image: the example contact form described above]

Your backend code for this form looks something like this:

(I use PHP here as an example. In other languages it looks a bit different, of course, but the logic is the same.)

<?php

// Get the form data
$formData = $_POST;

// Validate the form data
if (!validateFormData($formData)) {
    // If the form data is not valid, show an error message
    echo 'Your form data are not valid.';
    exit;
}

// If everything is valid, send the email.
mail('info@example.com', 'Contact form message', 'Hello webmaster, here is a contact form message .........');

The function validateFormData validates the form data, for example by checking for a valid email address. I have left its body out of this example.
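
For readers who want something concrete, here is a minimal sketch of what such a validation function might look like. The field names name, emailAddress, and message are assumptions borrowed from the verified-fields check further down in this comment, and the rules are only illustrative; a real form will need its own.

```php
<?php

// Hypothetical validation helper; the field names and rules are assumptions.
function validateFormData(array $formData): bool
{
    // All three fields must be present as non-empty strings
    foreach (['name', 'emailAddress', 'message'] as $fieldName) {
        if (!isset($formData[$fieldName]) || !is_string($formData[$fieldName]) || trim($formData[$fieldName]) === '') {
            return false;
        }
    }

    // The email address must be syntactically valid
    return filter_var($formData['emailAddress'], FILTER_VALIDATE_EMAIL) !== false;
}
```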

Starting the verification

Now we have to add the mosparo verification. Where you can do that depends on the system and language you use (see Systems and verification below). The most important thing is to execute the verification before you process the submitted form data. Your use case may be more complex than my example: you may store the form data in your database or send it to a 3rd party API. Before you do any of this, you have to do the verification. We recommend verifying the form data with mosparo even before you validate it: if the mosparo verification fails, the form data are invalid anyway, so there is nothing left to validate. But, of course, you can also do the mosparo verification in the same step as the validation.

To keep our example from above, I will do it before the validation:

<?php

// Get the form data
$formData = $_POST;

// Verify the form data with mosparo
if (!verifyFormDataWithMosparo($formData)) {
    // General error message, we don't know the exact reason for the failed verification here
    echo 'The form data contains spam.';
    exit;
}

// Validate the form data
if (!validateFormData($formData)) {
    // If the form data is not valid, show an error message
    echo 'Your form data are not valid.';
    exit;
}

// If everything is valid, send the email.
mail('info@example.com', 'Contact form message', 'Hello webmaster, here is a contact form message .........');

Systems and verification

Depending on your system, you cannot simply add a verification step as I did in my example above. In our WordPress plugin, for example, we add mosparo like a standard form field and wait for the validation method to be called. If you're using a framework or a system that offers events, you have to find the right event and execute the verification there.

Executing the verification

So far, we have only called the verification function but not defined it. So now, we have to define it:

<?php

function verifyFormDataWithMosparo(array $formData)
{
    // 1. Remove the ignored fields from the form data
    // 2. Extract the submit and validation token from the form data
    // 3. Prepare the form data
    // 4. Generate the hashes
    // 5. Generate the form data signature
    // 6. Generate the validation signature
    // 7. Prepare the verification signature
    // 8. Collect the request data
    // 9. Generate the request signature
    // 10. Send the API request
    // 11. Check the response 
}

This looks like a lot of work, but it sounds more complicated than it is. Let's start at the beginning:

1. Remove the ignored fields from the form data

mosparo does not validate field types like checkbox, radio, password, and hidden. There are more ignored fields, which you can find on this list here: https://documentation.mosparo.io/docs/integration/ignored_fields/

You have to remove these from the form data since mosparo did not validate these fields.
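
The submitted data does not carry the field types, so the backend has to know which field names belong to ignored fields. A minimal sketch, assuming you keep that list yourself; removeIgnoredFields and the example field names are hypothetical, not part of mosparo:

```php
<?php

// Hypothetical helper: drops the fields that mosparo ignored in the frontend.
// $ignoredFieldNames is an assumption; list the names of your own checkbox,
// radio, password, and hidden fields here.
function removeIgnoredFields(array $formData, array $ignoredFieldNames): array
{
    // Keep only the entries whose key is not in the ignore list
    return array_diff_key($formData, array_flip($ignoredFieldNames));
}

$formData = removeIgnoredFields(
    ['name' => 'Ann', 'newsletter' => '1', 'password' => 'secret'],
    ['newsletter', 'password']
);
// $formData now contains only the 'name' field
```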

2. Extract the submit and validation token from the form data

mosparo automatically adds the submit token and the validation token to your form data, so you should have these two values in your form data. Extract them and store each in a variable:

$submitToken = $formData['_mosparo_submitToken'];
$validationToken = $formData['_mosparo_validationToken'];
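
If the form was posted directly or the mosparo box was removed, these two keys will be missing. A defensive sketch of the extraction, under the assumption that a missing token should simply count as a failed verification; extractMosparoTokens is a hypothetical helper name:

```php
<?php

// Hypothetical helper: returns both tokens, or null if either one is missing
function extractMosparoTokens(array $formData): ?array
{
    $submitToken = $formData['_mosparo_submitToken'] ?? '';
    $validationToken = $formData['_mosparo_validationToken'] ?? '';

    // Both tokens must be non-empty strings
    if (!is_string($submitToken) || !is_string($validationToken) || $submitToken === '' || $validationToken === '') {
        return null;
    }

    return [$submitToken, $validationToken];
}
```

A caller could then return false from the verification function as soon as this helper returns null.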

3. Prepare the form data

Now, we have to clean up the form data. For this, we iterate over the form data. If a field name starts with _mosparo_, we must remove this value from the form data. For all other fields, we have to replace CRLF line breaks with LF line breaks.

$preparedFormData = [];
foreach ($formData as $fieldName => $value) {
    if (str_starts_with($fieldName, '_mosparo_')) {
        continue;
    }

    $preparedFormData[$fieldName] = str_replace("\r\n", "\n", $value);
}

4. Generate the hashes

Since we do not want to transfer the plain-text form data to mosparo, we create hashes. For that, we iterate over the array of the prepared form data and create a SHA256 hash for every value. You can combine steps 3 and 4 into one loop, but I keep them separate here to make each step clearer. Also, please sort the array alphabetically by field name in ascending order (A-Z).

foreach ($preparedFormData as $fieldName => $value) {
    $preparedFormData[$fieldName] = hash('sha256', $value);
}

ksort($preparedFormData);
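
As mentioned, steps 3 and 4 can be combined into one loop. A sketch of that variant, assuming every submitted value is a string and PHP 8 is available (for str_starts_with); prepareAndHashFormData is a hypothetical name:

```php
<?php

// Sketch: steps 3 and 4 in a single loop (assumes every value is a string)
function prepareAndHashFormData(array $formData): array
{
    $hashedFormData = [];
    foreach ($formData as $fieldName => $value) {
        // Skip the fields that mosparo added itself
        if (str_starts_with((string) $fieldName, '_mosparo_')) {
            continue;
        }

        // Normalize CRLF to LF, then hash the value
        $hashedFormData[$fieldName] = hash('sha256', str_replace("\r\n", "\n", $value));
    }

    // Sort by field name in ascending order
    ksort($hashedFormData);

    return $hashedFormData;
}
```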

5. Generate the form data signature

Now, we create a signature to prove the validity of the prepared form data. For this, we convert the prepared form data into a JSON string and then create an HMAC SHA256 hash with the project's private key.

$jsonPreparedFormData = json_encode($preparedFormData);
$projectPrivateKey = 'abc.........'; // You can find this value in the project settings in mosparo
$formDataSignature = hash_hmac('sha256', $jsonPreparedFormData, $projectPrivateKey);

6. Generate the validation signature

With the same method as in step 5, we create the signature of the validation token (an HMAC SHA256 hash):

$validationSignature = hash_hmac('sha256', $validationToken, $projectPrivateKey);

7. Prepare the verification signature

To later confirm the response from mosparo, we create a verification signature. The signature is the combination of the validation and the form data signature as an HMAC SHA256 hash.

$combinedSignatures = $validationSignature . $formDataSignature;
$verificationSignature = hash_hmac('sha256', $combinedSignatures, $projectPrivateKey); 
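
To illustrate how steps 5 to 7 fit together, the sketch below wraps them in one function and shows that changing a single value changes the resulting signatures. The key, the token, and the function name buildSignatures are made-up placeholders for illustration only.

```php
<?php

// Sketch of steps 5 to 7; the key and token used below are made-up placeholders
function buildSignatures(array $hashedFormData, string $validationToken, string $privateKey): array
{
    $formDataSignature = hash_hmac('sha256', json_encode($hashedFormData), $privateKey);
    $validationSignature = hash_hmac('sha256', $validationToken, $privateKey);
    $verificationSignature = hash_hmac('sha256', $validationSignature . $formDataSignature, $privateKey);

    return [$formDataSignature, $validationSignature, $verificationSignature];
}

$original = buildSignatures(['message' => hash('sha256', 'Hello')], 'example-token', 'example-private-key');
$tampered = buildSignatures(['message' => hash('sha256', 'Buy now!')], 'example-token', 'example-private-key');

// The validation signature is identical, but the form data signature and
// therefore the verification signature differ once a value changes
var_dump($original[1] === $tampered[1]); // bool(true)
var_dump($original[2] === $tampered[2]); // bool(false)
```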

8. Collect the request data

We have prepared the form data and generated the signatures, so we can now prepare the API request for the verification API. For that, we prepare the request data, which we need to contact the verification API:

$apiEndpoint = '/api/v1/verification/verify'; // This is the API of mosparo, so it's a fixed value
$requestData = [
    'submitToken' => $submitToken,
    'validationSignature' => $validationSignature,
    'formSignature' => $formDataSignature,
    'formData' => $preparedFormData,
];

9. Generate the request signature

To authenticate the request, we need a request signature. We create another HMAC SHA256 hash with the combination of the API endpoint and the request data as a value.

$jsonRequestData = json_encode($requestData);
$combinedApiEndpointJsonRequestData = $apiEndpoint . $jsonRequestData;
$requestSignature = hash_hmac('sha256', $combinedApiEndpointJsonRequestData, $projectPrivateKey);

10. Send the API request

We have prepared all the necessary values and can contact the mosparo API. For that, we need an HTTP client to make the request to the API. I'm using the PHP library Guzzle to make my request, but of course, you can use any other client. The request to the API is a POST request, and you must add the public key and the request signature in the Authorization header (as a Basic authorization header, encoded as a Base64 string). The request data must be sent as the post data fields of the request.

$projectPublicKey = '987654....'; // You can find this value in the project settings in mosparo
$client = new \GuzzleHttp\Client([
    'base_uri' => 'https://mosparo.example.com', // The host of your mosparo installation
]);
$response = $client->request(
    'POST',
    $apiEndpoint,
    [
        'auth' => [$projectPublicKey, $requestSignature],
        'form_params' => $requestData,
    ]
);

11. Check the response

The request was sent, and we received a response. Now it's time to check the result of the verification. For that, decode the returned JSON string from the API. If the verification was processed correctly (without HTTP error messages), then in the response from mosparo, you should have the following fields: valid, verificationSignature, verifiedFields, and issues.

If the field valid is set to true and the field verificationSignature contains the same value as the verification signature prepared in step 7, then the form data are valid, and you can process the data. If valid is not true or the verification signature is not the same, then something was wrong with the request (or the user tried to manipulate it), and the submission is therefore rated as spam.

There is one additional crucial step. mosparo can only validate what it received in the frontend and what you sent in the backend. The user could change a required field in the browser into a field that mosparo ignores and bypass mosparo that way. After a successful verification, you should therefore ensure that all your required fields were verified. For this, mosparo returns an array with the verified fields. Make sure that all your fields are set in there:

$responseData = json_decode((string) $response->getBody(), true);

if (isset($responseData['valid']) && $responseData['valid'] && isset($responseData['verificationSignature']) && $responseData['verificationSignature'] == $verificationSignature) {
    // Make sure that all required fields were verified by mosparo
    if (!isset($responseData['verifiedFields']['name']) || !isset($responseData['verifiedFields']['emailAddress']) ||  !isset($responseData['verifiedFields']['message'])) {
        return false;
    }
    return true;
}

return false;
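
The same check can be written as a small reusable function. This is only a sketch: isVerificationValid and its parameters are hypothetical, and it swaps the == comparison for PHP's hash_equals(), which compares the two signatures in constant time.

```php
<?php

// Hypothetical helper for step 11; $requiredFields lists your own form fields
function isVerificationValid(array $responseData, string $verificationSignature, array $requiredFields): bool
{
    // The API must report the submission as valid
    if (($responseData['valid'] ?? false) !== true) {
        return false;
    }

    // The returned signature must match the one prepared in step 7
    $returnedSignature = $responseData['verificationSignature'] ?? null;
    if (!is_string($returnedSignature) || !hash_equals($verificationSignature, $returnedSignature)) {
        return false;
    }

    // Every required field must appear in the verified fields
    foreach ($requiredFields as $fieldName) {
        if (!isset($responseData['verifiedFields'][$fieldName])) {
            return false;
        }
    }

    return true;
}
```

For the contact form in this example, the call would be isVerificationValid($responseData, $verificationSignature, ['name', 'emailAddress', 'message']).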

Complete function

Now the complete function to execute the verification looks like this:

<?php

function verifyFormDataWithMosparo(array $formData)
{
    // 1. Remove the ignored fields from the form data
    // You have to do this only if you have ignored fields in your form

    // 2. Extract the submit and validation token from the form data
    $submitToken = $formData['_mosparo_submitToken'];
    $validationToken = $formData['_mosparo_validationToken'];

    // 3. Prepare the form data
    $preparedFormData = [];
    foreach ($formData as $fieldName => $value) {
        if (str_starts_with($fieldName, '_mosparo_')) {
            continue;
        }

        $preparedFormData[$fieldName] = str_replace("\r\n", "\n", $value);
    }

    // 4. Generate the hashes
    foreach ($preparedFormData as $fieldName => $value) {
        $preparedFormData[$fieldName] = hash('sha256', $value);
    }

    ksort($preparedFormData);

    // 5. Generate the form data signature
    $jsonPreparedFormData = json_encode($preparedFormData);
    $projectPrivateKey = 'abc.........'; // You can find this value in the project settings in mosparo
    $formDataSignature = hash_hmac('sha256', $jsonPreparedFormData, $projectPrivateKey);

    // 6. Generate the validation signature
    $validationSignature = hash_hmac('sha256', $validationToken, $projectPrivateKey);

    // 7. Prepare the verification signature
    $combinedSignatures = $validationSignature . $formDataSignature;
    $verificationSignature = hash_hmac('sha256', $combinedSignatures, $projectPrivateKey); 

    // 8. Collect the request data
    $apiEndpoint = '/api/v1/verification/verify'; // This is the API of mosparo, so it's a fixed value
    $requestData = [
        'submitToken' => $submitToken,
        'validationSignature' => $validationSignature,
        'formSignature' => $formDataSignature,
        'formData' => $preparedFormData,
    ];

    // 9. Generate the request signature
    $jsonRequestData = json_encode($requestData);
    $combinedApiEndpointJsonRequestData = $apiEndpoint . $jsonRequestData;
    $requestSignature = hash_hmac('sha256', $combinedApiEndpointJsonRequestData, $projectPrivateKey);

    // 10. Send the API request
    $projectPublicKey = '987654....'; // You can find this value in the project settings in mosparo
    $client = new \GuzzleHttp\Client([
        'base_uri' => 'https://mosparo.example.com', // The host of your mosparo installation
    ]);
    $response = $client->request(
        'POST',
        $apiEndpoint,
        [
            'auth' => [$projectPublicKey, $requestSignature],
            'form_params' => $requestData,
        ]
    );

    // 11. Check the response 
    $responseData = json_decode((string) $response->getBody(), true);

    if (isset($responseData['valid']) && $responseData['valid'] && isset($responseData['verificationSignature']) && $responseData['verificationSignature'] == $verificationSignature) {
        // Make sure that all required fields were verified by mosparo
        if (!isset($responseData['verifiedFields']['name']) || !isset($responseData['verifiedFields']['emailAddress']) ||  !isset($responseData['verifiedFields']['message'])) {
            return false;
        }
        return true;
    }

    return false;
}

After the verification

If the verification was successful, you can now process the form data as you did before, for example, sending it by email or storing it in a database.

Additional information

We'll refactor the documentation, probably with this text here, if it explains the verification better than the existing documentation. Additionally, we've added a verification simulation mode to the next larger mosparo version (v1.1), which will be released in January 2024. With it, you can see these steps with the actual values you need to verify the data (all the signatures and how they were created).

Please let me know if you have any questions or need better explanations.

Thank you very much for your help!

Kind regards,

zepich

Geremia commented 9 months ago

@zepich My form is purely HTML:

<form id="contact-form" method="post">

with an event listener to prevent navigation away from the page:

document.getElementById('contact-form').addEventListener('submit', function (e) {
     e.preventDefault();
});

Could I trigger verification somehow in this submit event listener function?

zepich commented 9 months ago

@Geremia Okay, so you have a contact form in HTML.

But somewhere, you must process the data, like submitting it by email or storing it in a database, right? Where do you do that, and how?

Geremia commented 9 months ago

@zepich Well, at the moment I'm just relying on mosparo to store the requests in the database. It seems my JavaScript function above could simply send an XHR request to a PHP script to do the processing, then.

zepich commented 9 months ago

@Geremia Okay, now I understand what you want to do. In general, the idea of mosparo is not to use it as a storage for the submissions. mosparo stores the submission for a short time (14 days) only.

I understand your idea of using mosparo to store the data (and delete it automatically after 14 days). If you want to do that, you have to use a small cronjob, for example, to set the field valid to 1 for all submissions. Something like UPDATE submission SET valid = 1, verified_at = NOW() WHERE valid IS NULL; should work.

But full spam protection is only possible with verification. And the verification must happen in the backend code, like a PHP script. To perform the verification, you have to put the private key in your code, and if you do that in the frontend, a bad actor can see that and use it for other things.

So, as you mentioned, you have to do an XHR request to a PHP script in the JavaScript event handler you posted before. In this PHP script, you have to execute the verification as written in my long (I'm sorry for the length ;) ) comment above before you store the data in a database or send the data to you by email.
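
As a rough sketch of how such a PHP script could answer the XHR request: the helper below maps the verification result to an HTTP status and a JSON body. The function name, the status codes, and the JSON keys are assumptions for illustration, not something mosparo prescribes.

```php
<?php

// Hypothetical helper: maps the verification result to an HTTP status and JSON body.
// The status codes and the JSON keys are assumptions, not part of mosparo.
function buildVerifyResponse(bool $verified): array
{
    if ($verified) {
        return ['status' => 200, 'body' => json_encode(['success' => true])];
    }

    return ['status' => 400, 'body' => json_encode(['success' => false, 'error' => 'The form data contains spam.'])];
}

// In the real script, $verified would come from verifyFormDataWithMosparo($_POST);
// it is hard-coded here so that the sketch runs on its own.
$verified = false;
$result = buildVerifyResponse($verified);

http_response_code($result['status']);
header('Content-Type: application/json');
echo $result['body'];
```

The submit event listener can then read the status of the XHR response to decide what to show the user.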

Geremia commented 9 months ago

@zepich Is there a way to store the submissions permanently or change the 14 days to something else?

Geremia commented 9 months ago

@zepich I don't have Guzzle. I'm trying it this way:

$query = http_build_query($requestData);
$options = [
    'http' => [
        'header' => ['Content-type: application/x-www-form-urlencoded',
                     'Authorization: Basic ' . base64_encode("$projectPublicKey:$requestSignature")],
        'method' => 'POST',
        'content' => $query
    ],
];
$context = stream_context_create($options);
$response = file_get_contents($baseURL . $apiEndpoint, false, $context);

But my response is:

{
   "error" : true,
   "errorMessage" : "Request invalid."
}

zepich commented 9 months ago

> Is there a way to store the submissions permanently or change the 14 days to something else?

Not an official one. You can always change mosparo's code and turn off the cleanup (see src/Helper/CleanupHelper.php). But the idea of mosparo is to scan the form data and store them for 14 days to allow the website owner to see what got blocked accidentally. mosparo is officially not built to store the data forever.

> I don't have Guzzle. I'm trying it this way:

Your code is good. I've tested it with my demo form, and it's working perfectly.

This message (Request invalid) usually occurs because the request signature was incorrect. Have you used my code from the long comment above 1:1, or have you changed it? I've tested it with my code from above, and there are no issues.

Can you verify that you used the correct public and private keys?

Can you show the full (verification) code of your script?

Geremia commented 9 months ago

@zepich I send the XHR request as follows:

var formData = new FormData( document.getElementById('contact-form') );
var xhr = new XMLHttpRequest();
xhr.open('POST', 'verify.php', true);
xhr.send(formData);

verify.php (with my public and private keys redacted):

<?php
$formData = $_POST;

$submitToken = $formData['_mosparo_submitToken'];
$validationToken = $formData['_mosparo_validationToken'];

$preparedFormData = [];
foreach ($formData as $fieldName => $value) {
    if (str_starts_with($fieldName, '_mosparo_')) {
        continue;
    }

    $preparedFormData[$fieldName] = str_replace("\r\n", "\n", $value);
}

foreach ($preparedFormData as $fieldName => $value) {
    $preparedFormData[$fieldName] = hash('sha256', $value);
}

ksort($preparedFormData);

$jsonPreparedFormData = json_encode($preparedFormData);
$projectPrivateKey = '<REDACTED>';
$formDataSignature = hash_hmac('sha256', $jsonPreparedFormData, $projectPrivateKey);

$validationSignature = hash_hmac('sha256', $validationToken, $projectPrivateKey);

$combinedSignatures = $validationSignature . $formDataSignature;
$verificationSignature = hash_hmac('sha256', $combinedSignatures, $projectPrivateKey); 

$apiEndpoint = '/api/v1/verification/verify';
$requestData = [
    'submitToken' => $submitToken,
    'validationSignature' => $validationSignature,
    'formSignature' => $formDataSignature,
    'formData' => $preparedFormData,
];

$jsonRequestData = json_encode($requestData);
$combinedApiEndpointJsonRequestData = $apiEndpoint . $jsonRequestData;
$requestSignature = hash_hmac('sha256', $combinedApiEndpointJsonRequestData, $projectPrivateKey);

$projectPublicKey = '<REDACTED>';
$options = [
    'http' => [
        'header' => ['Content-type: application/x-www-form-urlencoded',
                     'Authorization: Basic ' . base64_encode("$projectPublicKey:$requestSignature")],
        'method' => 'POST',
        'content' => http_build_query($requestData),
    ],
];
$context = stream_context_create($options);
$response = file_get_contents('https://mysite/mosparo/public/index.php' . $apiEndpoint, false, $context);
echo $response;

zepich commented 9 months ago

Hi @Geremia

I'm sorry for the delay. The last couple of days were hectic.

I've tested your code with my test setup, and it worked perfectly. I then saw that you have /mosparo/public/index.php as the base path for the API request, and I set up mosparo in the same structure. When I do that, I get the same error as you had.

The problem here is that if you do that, you must also add the base path to the request signature. So, in this line:

$combinedApiEndpointJsonRequestData = $apiEndpoint . $jsonRequestData;

You have to specify the full path, so something like this will work:

$combinedApiEndpointJsonRequestData = '/mosparo/public/index.php' . $apiEndpoint . $jsonRequestData;
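
Instead of hard-coding the prefix, the path to sign could be derived from the URL that is actually requested. A sketch, assuming the base URL is kept in one place; buildSignedPath is a hypothetical helper name:

```php
<?php

// Sketch: derive the path to sign from the base URL of the mosparo installation
function buildSignedPath(string $baseUrl, string $apiEndpoint): string
{
    // parse_url() returns null when the URL has no path and false when it cannot parse it
    $basePath = parse_url($baseUrl, PHP_URL_PATH);
    if (!is_string($basePath)) {
        $basePath = '';
    }

    return rtrim($basePath, '/') . $apiEndpoint;
}

echo buildSignedPath('https://mysite/mosparo/public/index.php', '/api/v1/verification/verify'), "\n";
// /mosparo/public/index.php/api/v1/verification/verify
echo buildSignedPath('https://mosparo.example.com', '/api/v1/verification/verify'), "\n";
// /api/v1/verification/verify
```

The returned path would then take the place of $apiEndpoint when building the string for the request signature.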

But:

We do not recommend this setup. We officially recommend setting up mosparo on a domain (example.com) or a subdomain (mosparo.example.com; the subdomain can have any name) and setting the document root of the virtual host to the public directory of mosparo. A setup in a subdirectory like yours can work, but it can lead to problems like the one you have, and in the worst case, it can expose too much data from mosparo. For example, the var directory with the cache is publicly accessible, and anything stored there could be leaked.

I hope this helps you to verify the submission correctly. Please let me know if you have any other questions or error messages.

Kind regards,

zepich

Geremia commented 8 months ago

@zepich That fixed it. Thanks & merry Christmas! 🎄 🙏🏻 👶 🌠

zepich commented 8 months ago

@Geremia Absolutely no problem, you're welcome!

Thank you for letting me know. Please let me know if you need any other help!

Thank you. Merry Christmas!

Geremia commented 7 months ago

@zepich

> You can always change mosparo's code and turn off the cleanup (see src/Helper/CleanupHelper.php).

Such as by commenting out all the DELETE queries?

zepich commented 7 months ago

@Geremia

Yes, you could either add a return; at the start of the cleanup() method (https://github.com/mosparo/mosparo/blob/master/src/Helper/CleanupHelper.php#L29) so that it does not execute any of the cleanup process at all, or you could comment out some of the DELETE queries.

You could also increase the time after which the submissions are deleted (search for 14D and adjust it to whatever value you want).

But please remember that these changes only last until you update to a newer version of mosparo.