panphp / pan

A simple, lightweight, and privacy-focused product analytics php package
MIT License
1.08k stars 48 forks source link

Suggestion on how to prevent potential client-side abuse #3

Open caendesilva opened 1 month ago

caendesilva commented 1 month ago

By default, Pan only allows 50 or fewer analytics to be stored. This prevents any potential abuse of the system, as the analytics "name" are controlled on the client-side. Open to suggestions on how to improve this.

Maybe a "strict" mode could be added, where Pan scans all Blade files for data-pan attributes in the HTML, and adds them to a cached array of whitelisted analytic names. This would not work for dynamic attributes but could still be pretty cool I think. That way only names declared on the server-side are accepted.

nunomaduro commented 1 month ago

Honestly; probably a configuration file, or a Pan::analytics(['foo', 'bar']); in your service provider could be good enough.

caendesilva commented 1 month ago

Would work too!

adiologydev commented 1 month ago

What if there was a command you could use to go through the project find the tags and store them in a file?

Maybe this could be a utility command that populates Pan::analytics by default without requiring manual labour. Similarly, it could let you create/remove tags via commands as well.

coclav commented 1 month ago

I am not sure I understand why this limitation of 50 rows.

I think the DatabaseAnalyticsRepository @ increment method could be written in a single SQL query providing there is a unique key on name

insert into
    pan (name, impressions, hovers, clicks)
values
    ('admin-user-mgmt', 0, 0, 0) 
on duplicate key update
   -- add one or more of the following to update the different type
    hovers = (
        select
            p.hovers + 1
        from
            pan p
        where
            name = name
    );
nunomaduro commented 1 month ago

Can someone contribute with a pull request that adds two methods?

Pan::analytics()->max(100);
Pan::analytics()->allowed(['name-one', 'name-two']);
Pan::analytics()->unlimited();
assertchris commented 1 month ago

On it!

assertchris commented 1 month ago

Draft here: https://github.com/panphp/pan/pull/9

pxpm commented 1 month ago

Hey @nunomaduro thanks for this package, really easy to use.

I've been thinking on how to prevent abuse from the FE. It's a tricky one.

Maybe instead of directly writing the data-pan="my-button" we could create a blade directive that uses some key to validate it's a developer pan and not some malicious actor on the front-end ?

Something like:

<button @pan("my-button")>
<button @pan("my-button-2")>

That outputs something like:

<button data-pan-34j3jdslk23="my-button">
<button data-pan-asmfj3kjwe="my-button-2">

The BE can then validate the 34j3jdslk23 and asmfj3kjwe as valid tokens for pans ?

In addition to allowed('pan-name'), we could also add allowed('pan-name')->on('route.name') or something similar ? So that malicious actors can't add pans in non-relevant urls ?

I think the "tokens" solve one part of the issue, tying the pan to a route solves the second part. We are left with click abusers and page refreshers on steroids... maybe with a debounce we solve that later too 🤷

Let me know what you guys think 👍

caendesilva commented 1 month ago

Hey @nunomaduro thanks for this package, really easy to use.

I've been thinking on how to prevent abuse from the FE. It's a tricky one.

Maybe instead of directly writing the data-pan="my-button" we could create a blade directive that uses some key to validate it's a developer pan and not some malicious actor on the front-end ?

Something like:

<button @pan("my-button")>
<button @pan("my-button-2")>

That outputs something like:

<button data-pan-34j3jdslk23="my-button">
<button data-pan-asmfj3kjwe="my-button-2">

The BE can then validate the 34j3jdslk23 and asmfj3kjwe as valid tokens for pans ?

In addition to allowed('pan-name'), we could also add allowed('pan-name')->on('route.name') or something similar ? So that malicious actors can't add pans in non-relevant urls ?

I think the "tokens" solve one part of the issue, tying the pan to a route solves the second part. We are left with click abusers and page refreshers on steroids... maybe with a debounce we solve that later too 🤷

Let me know what you guys think 👍

I think the @pan directive is a very solid solution to this problem. The question is if the problem is bad enough to warrant the increased complexity of implementing this solution. It's of course up to Nuno, but I would think that it could be reasonable to hold off and see if this issue affects a lot of people, so it can be weighed against the added steps of both a new directive which IDEs may not recognize, and the added setup of having to add a secret key to the setup process.

pxpm commented 1 month ago

I think the @pan directive is a very solid solution to this problem. The question is if the problem is bad enough to warrant the increased complexity of implementing this solution. It's of course up to Nuno, but I would think that it could be reasonable to hold off and see if this issue affects a lot of people, so it can be weighed against the added steps of both a new directive which IDEs may not recognize, and the added setup of having to add a secret key to the setup process.

Hey @caendesilva thanks for the feedback. 👍

I was just shooting from the top of my head, I am not 100% sure it may be the correct solution for the problem at hand. It could be a way to make things more difficult, but you can just copy the data-pan-s3cr3t around the page .. so it would probably need some extra layer of enforcing the uniqueness of the pan.

I will hold for sure, I trust Nuno judgment way more than mine 🙏

Once again, thanks for your input and for raising the question, I think it's something that warrants some brainstorming to try to solve. No one would like to have skewed analytics to take on business decisions 👍

Cheers