Closed eSilverStrike closed 2 years ago
Recording the last IP address will be useful, but this is related to issue #438 so we should be very careful.
Agreed. I had actually thought of that but didn't realize there was a GitHub issue for it already. Geeklog has a few areas (like speedlimit) where we would have to figure out how to handle the IP issue.
Currently, Geeklog records IP addresses in these tables:
table | column |
---|---|
comments | ipaddress |
commentsubmissions | ipaddress |
likes | ipaddress |
pollvoters | ipaddress |
sessions | remote_ip |
spamx | value (if spamx.name === 'IP') |
speedlimit | ipaddress |
trackback | ipaddress |
It is very hard to anonymize IP addresses using these tables as they are. So, I would suggest creating a new table ips like this:
name | data_type |
---|---|
seq | INT AUTO_INCREMENT |
ip | VARCHAR(39) NOT NULL |
item_type | VARCHAR(30) NOT NULL |
item_id | VARCHAR(30) |
created | DATETIME |
Suppose we save a comment. First, we save the comment to the comments table with the ipaddress column left blank. Then, we save the real IP address to the ips table like:
INSERT INTO ips (ip, item_type, item_id, created) VALUES ('real IP', 'comment', 'comment_id', CURRENT_TIMESTAMP);
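The two-step save described above could be sketched roughly as follows. This is only an illustration, assuming an in-memory SQLite database in place of Geeklog's real database layer; the simplified `comments` table and the helper names `create_schema` and `save_comment` are hypothetical and not part of Geeklog.

```python
import sqlite3


def create_schema(conn):
    """Create a simplified comments table and the proposed ips table."""
    conn.executescript(
        """
        CREATE TABLE comments (
            cid INTEGER PRIMARY KEY AUTOINCREMENT,
            comment TEXT NOT NULL,
            ipaddress TEXT NOT NULL DEFAULT ''
        );
        CREATE TABLE ips (
            seq INTEGER PRIMARY KEY AUTOINCREMENT,
            ip VARCHAR(39) NOT NULL,
            item_type VARCHAR(30) NOT NULL,
            item_id VARCHAR(30),
            created DATETIME
        );
        """
    )


def save_comment(conn, text, remote_ip):
    """Store the comment with a blank ipaddress, then log the real IP in ips."""
    with conn:  # both inserts commit together or not at all
        cur = conn.execute(
            "INSERT INTO comments (comment, ipaddress) VALUES (?, '')", (text,)
        )
        comment_id = cur.lastrowid
        conn.execute(
            "INSERT INTO ips (ip, item_type, item_id, created) "
            "VALUES (?, 'comment', ?, CURRENT_TIMESTAMP)",
            (remote_ip, str(comment_id)),
        )
    return comment_id
```

Wrapping both inserts in one transaction avoids a comment row that has no matching entry in ips.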
After a given time has passed, we anonymize the IP addresses stored in ips by way of the pseudo-cron feature executed at the end of lib-common.php.
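A delayed anonymization pass might look something like the sketch below. The thread does not say how an address should be anonymized, so masking to a /24 (IPv4) or /48 (IPv6) network is an assumption here, as are the function names and the use of SQLite.

```python
import ipaddress
import sqlite3


def anonymize_ip(ip):
    """Zero the host part of an address (assumed policy: /24 for IPv4, /48 for IPv6)."""
    addr = ipaddress.ip_address(ip)
    prefix = 24 if addr.version == 4 else 48
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)


def anonymize_old_ips(conn, max_age_days):
    """Mask every ips row older than max_age_days; return how many rows were visited."""
    cutoff = f"-{int(max_age_days)} days"
    rows = conn.execute(
        "SELECT seq, ip FROM ips WHERE created < datetime('now', ?)", (cutoff,)
    ).fetchall()
    with conn:
        for seq, ip in rows:
            conn.execute(
                "UPDATE ips SET ip = ? WHERE seq = ?", (anonymize_ip(ip), seq)
            )
    return len(rows)
```

Masking is idempotent, so rows that were already anonymized can be visited again on a later run without changing them. A real implementation would probably flag processed rows to avoid the repeated work.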
How does this sound?
We should also include an API so plugins like the Forum could use it as well.
I see 2 ways of doing this
1) Using your "ips" table except we would need to add an extra column called "subtype" like in the "likes" table just in case a plugin has a number of different items users can add. We could then remove the "ipaddress" column from the other tables unless you want to use it for the "seq" column?
OR
2) Store the "seq" id instead of the "ipaddress" in the comments, likes, etc... tables and then the "ips" table would just contain the columns seq, ip, and created.
I am not sure if we need "item_type" and "item_id" in the "ips" table... will we ever have to look at the "ips" table and figure out where the IP is from? (maybe at some point we will?)
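To make the second option concrete, here is a rough sketch in which the comment row stores only the seq of an ips row. It also shows the kind of small helper a plugin-facing API might expose. The column name `ip_seq` and the functions `record_ip`, `save_comment_with_seq`, and `lookup_ip` are invented for illustration, and SQLite stands in for the real database.

```python
import sqlite3


def create_seq_schema(conn):
    """ips holds only seq, ip, created; comments reference it by seq."""
    conn.executescript(
        """
        CREATE TABLE ips (
            seq INTEGER PRIMARY KEY AUTOINCREMENT,
            ip VARCHAR(39) NOT NULL,
            created DATETIME
        );
        CREATE TABLE comments (
            cid INTEGER PRIMARY KEY AUTOINCREMENT,
            comment TEXT NOT NULL,
            ip_seq INTEGER REFERENCES ips(seq)
        );
        """
    )


def record_ip(conn, ip):
    """Helper a plugin could call: store an IP and return its seq."""
    cur = conn.execute(
        "INSERT INTO ips (ip, created) VALUES (?, CURRENT_TIMESTAMP)", (ip,)
    )
    return cur.lastrowid


def save_comment_with_seq(conn, text, remote_ip):
    """Save a comment that points at its ips row instead of holding the IP."""
    with conn:
        seq = record_ip(conn, remote_ip)
        cur = conn.execute(
            "INSERT INTO comments (comment, ip_seq) VALUES (?, ?)", (text, seq)
        )
    return cur.lastrowid


def lookup_ip(conn, cid):
    """Resolve the (possibly anonymized) IP for a comment, or None."""
    row = conn.execute(
        "SELECT ips.ip FROM comments JOIN ips ON ips.seq = comments.ip_seq "
        "WHERE comments.cid = ?",
        (cid,),
    ).fetchone()
    return row[0] if row else None
```

With this layout, anonymizing means updating a single row in ips, and every table that references that seq sees the masked value. The trade-off raised above still applies: without item_type and item_id, finding where an IP was used requires searching each referencing table.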
I think your second suggestion is better.
I tried implementing the IP anonymization feature with the above idea, but found it requires changing a lot of existing code, in particular speedlimit and comments. I'm now thinking of keeping the current tables and adding an option to choose whether IP addresses are anonymized immediately or not. Any thoughts?
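The simpler alternative, keeping the existing tables and deciding at write time, might be sketched like this. The flag name `anonymize_immediately` and both functions are hypothetical, and the /24 and /48 masks are an assumed policy rather than anything settled in this thread.

```python
import ipaddress


def mask_ip(ip):
    """Zero the host part of an address (assumed policy: /24 for IPv4, /48 for IPv6)."""
    addr = ipaddress.ip_address(ip)
    prefix = 24 if addr.version == 4 else 48
    return str(ipaddress.ip_network(f"{addr}/{prefix}", strict=False).network_address)


def ip_for_storage(remote_ip, anonymize_immediately):
    """Return the value to write to an existing ipaddress column."""
    return mask_ip(remote_ip) if anonymize_immediately else remote_ip
```

With the option turned on, `203.0.113.7` and `203.0.113.200` are both stored as `203.0.113.0`, so anything that compares stored addresses (such as speedlimit) would then act per network rather than per host.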
I think I'd rather deal with the original issue of this feature request so we can eventually add better spam protection, and then create a new issue for the anonymization feature. I think doing the anonymizing the right way the first time is a better approach (it is the better approach, right?). The other big thing with creating a new table (besides creating an API so other plugins like the Forum can use it) is the work it will take during upgrade. We will have to go through the comment table, copy the IPs into the ip table, and then write the new ids back to the comment table, etc., in case the admin never wants to anonymize anything.
Anonymizing IPs right away would limit checking for spam, wouldn't it? What are the other disadvantages?
> I think I'd rather deal with the original issue of this feature request so we can eventually add better spam protection, and then create a new issue for the anonymization feature.
I agree. I will reopen the issue #438 or create a new one.
> Anonymizing IPs right away would limit checking for spam, wouldn't it? What are the other disadvantages?
Yes, it would limit checking for spam. One of the other disadvantages of not anonymizing IPs right away is how we should deal with the IPs to be recorded in log files. It would be quite hard to anonymize such IPs later.
Good point. For speed reasons I think those would have to be anonymized right away, just like they would be in the web server's logs.
Implemented with change set 8d316a9f. Actually, I added the user's IP address to all email messages sent to the site admin.
When a new user is created (or added to the submission queue) or updated it would be nice to include the IP address of the user in the email.
The IP address can then be used by the Admin to determine if they want to ban the IP or not (if for example he got 3 new users in the last 30 minutes all from the same IP).
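The check an admin would otherwise do by eye (several new users from one IP within 30 minutes) could be automated along these lines. This is a sketch with hypothetical names; it assumes registrations are available as `(ip, created)` pairs and uses the 3 users and 30 minutes from the example above as defaults.

```python
from collections import Counter
from datetime import datetime, timedelta


def suspicious_ips(registrations, now, window=timedelta(minutes=30), threshold=3):
    """Return, sorted, the IPs with at least `threshold` sign-ups inside `window`.

    registrations: iterable of (ip, created) pairs, where created is a datetime.
    """
    recent = Counter(
        ip for ip, created in registrations if now - window <= created <= now
    )
    return sorted(ip for ip, count in recent.items() if count >= threshold)
```

For example, three sign-ups from `203.0.113.7` in the last 25 minutes would be reported, while a fourth from two hours earlier would not count toward the threshold.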
We may also want to record the last IP used by the User in the User tables so this information can be searched, since at the moment the only way to find it is to look at the server logs. This will help Admins handle SPAM and fake user accounts more easily.