nextcloud / suspicious_login

Detect and warn about suspicious IPs logging into Nextcloud
GNU Affero General Public License v3.0

Insufficient data: No recent data available #174

Open jcheger opened 5 years ago

jcheger commented 5 years ago

I installed the plugin on 2 sites. One works as expected, but the second one is stuck.

Any help on how to get out of this state would be welcome. Is there a file or database table I should delete?

ChristophWurst commented 5 years ago

Could you try again? Do you have both IPv4 and IPv6 data? What version of the app do you use?

jcheger commented 5 years ago

Nextcloud 16.0.5, Suspicious Login 1.0.0, IPv4 only

Still the same result: Not enough data, try again later (Insufficient data: No recent data available)

ChristophWurst commented 5 years ago

That is strange. Could you run an SQL query to count the number of rows in oc_login_address_aggregated that have a first_seen larger than the unix timestamp from a week ago?

The only case where you might not have new IPs for the last week is when your IPs never change. But that seems unlikely.
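The requested count can be sketched as follows. Since first_seen holds a Unix timestamp (an integer), the cutoff is computed as an integer too. This is a minimal sketch, not the app's own code: it uses Python's sqlite3 as a stand-in for MariaDB, the table only has the columns shown in this thread, the sample rows are invented, and `count_recent_addresses` is a hypothetical helper name.

```python
import sqlite3
import time

WEEK_SECONDS = 7 * 24 * 60 * 60


def count_recent_addresses(conn, now=None):
    """Count aggregated rows whose first_seen lies within the last week."""
    now = int(time.time()) if now is None else now
    cutoff = now - WEEK_SECONDS
    # MariaDB equivalent:
    #   SELECT COUNT(*) FROM oc_login_address_aggregated
    #   WHERE first_seen > UNIX_TIMESTAMP(NOW() - INTERVAL 1 WEEK);
    row = conn.execute(
        "SELECT COUNT(*) FROM oc_login_address_aggregated WHERE first_seen > ?",
        (cutoff,),
    ).fetchone()
    return row[0]


# Stand-in table (assumption: only the columns visible in this thread).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE oc_login_address_aggregated ("
    "id INTEGER PRIMARY KEY, seen INTEGER, first_seen INTEGER, last_seen INTEGER)"
)
now = 1_570_600_000  # fixed "current time" so the example is reproducible
conn.executemany(
    "INSERT INTO oc_login_address_aggregated VALUES (?, ?, ?, ?)",
    [
        (1, 10, now - 30 * 86400, now - 3600),  # old address, still in use
        (2, 2, now - 2 * 86400, now - 86400),   # first seen two days ago
    ],
)
print(count_recent_addresses(conn, now=now))
```

A result of 0 would mean that no address was first seen during the last week.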

jcheger commented 5 years ago

MariaDB [nextcloud]> SELECT id,seen,
    ->   DATE_FORMAT(FROM_UNIXTIME(first_seen),'%Y-%m-%dT%TZ') as first_seen,
    ->   DATE_FORMAT(FROM_UNIXTIME(last_seen),'%Y-%m-%dT%TZ') as last_seen
    ->   FROM oc_login_address_aggregated
    ->   WHERE first_seen>DATE_SUB(NOW(), INTERVAL 1 WEEK);
Empty set, 44 warnings (0.00 sec)

I don't know what the records in this table mean. However, I logged out and back in from a web browser, and restarted the client on a machine, without any change in this table (not even in the last_seen column).

FYI, I use TOTP myself, but I also have a Synology that syncs over WebDAV. One of my colleagues also syncs his Synology, but I'm not sure he uses the client. Users are also authenticated against LDAP (Active Directory).

In case you doubt my query, here is the content of the table:

MariaDB [nextcloud]> SELECT id,seen,
    ->   DATE_FORMAT(FROM_UNIXTIME(first_seen),'%Y-%m-%dT%TZ') as first_seen,
    ->   DATE_FORMAT(FROM_UNIXTIME(last_seen),'%Y-%m-%dT%TZ') as last_seen
    ->   FROM oc_login_address_aggregated;
+----------+----------+----------------------+----------------------+
| id       | seen     | first_seen           | last_seen            |
+----------+----------+----------------------+----------------------+
|        1 | 29307778 | 2019-06-04T22:34:36Z | 2019-10-09T06:41:36Z |
|      648 |    30123 | 2019-06-04T22:37:11Z | 2019-10-09T02:55:32Z |
|   215970 |       18 | 2019-06-05T15:43:41Z | 2019-09-27T15:13:51Z |
|   461456 |        3 | 2019-06-06T12:37:05Z | 2019-06-06T13:38:26Z |
|   564536 |        4 | 2019-06-07T21:49:51Z | 2019-06-07T21:57:41Z |
|  1537240 |        4 | 2019-06-11T11:59:12Z | 2019-06-11T11:59:13Z |
|  2160305 |        4 | 2019-06-14T09:52:23Z | 2019-06-14T10:40:49Z |
|  4678419 |       10 | 2019-06-23T19:45:52Z | 2019-06-25T19:41:16Z |
|  4884910 |      532 | 2019-06-24T10:17:24Z | 2019-10-08T08:59:55Z |
|  6286938 |       22 | 2019-06-28T14:21:34Z | 2019-07-06T13:25:16Z |
|  6664333 |     1317 | 2019-06-29T17:52:47Z | 2019-06-29T19:29:01Z |
|  6932598 |       26 | 2019-06-30T12:06:36Z | 2019-06-30T12:06:55Z |
|  8461734 |      104 | 2019-07-12T10:14:57Z | 2019-10-07T19:57:56Z |
|  9462170 |        2 | 2019-07-15T15:37:51Z | 2019-07-15T15:37:51Z |
|  9491559 |        2 | 2019-07-30T16:57:41Z | 2019-07-30T16:57:41Z |
|  9865499 |        2 | 2019-07-31T17:33:21Z | 2019-07-31T17:33:21Z |
| 12189113 |        3 | 2019-08-07T16:30:22Z | 2019-09-03T11:16:40Z |
| 12433925 |        4 | 2019-08-08T09:38:24Z | 2019-09-03T15:31:29Z |
| 13613275 |        2 | 2019-08-12T10:10:24Z | 2019-08-12T10:10:24Z |
| 13982567 |        3 | 2019-08-13T10:14:15Z | 2019-08-13T15:43:00Z |
| 14338698 |        3 | 2019-08-14T10:04:00Z | 2019-09-05T19:11:05Z |
| 14446679 |        2 | 2019-08-14T22:08:39Z | 2019-08-14T22:08:39Z |
| 14491331 |        2 | 2019-08-18T18:51:33Z | 2019-08-18T18:51:33Z |
| 14775786 |        2 | 2019-08-19T17:08:13Z | 2019-08-19T17:08:13Z |
| 15064891 |        3 | 2019-08-20T13:36:23Z | 2019-08-20T13:43:03Z |
| 15105664 |        6 | 2019-08-20T16:16:07Z | 2019-08-26T17:42:29Z |
| 17149344 |        2 | 2019-08-26T11:37:13Z | 2019-08-26T11:37:13Z |
| 17244033 |        2 | 2019-08-26T17:50:48Z | 2019-08-26T17:50:48Z |
| 18222581 |        7 | 2019-08-29T13:04:13Z | 2019-09-23T10:13:30Z |
| 19597374 |        2 | 2019-09-02T10:14:29Z | 2019-09-02T10:14:29Z |
| 19996955 |        4 | 2019-09-06T09:05:14Z | 2019-09-10T08:24:08Z |
| 20025304 |       79 | 2019-09-06T15:17:35Z | 2019-10-09T03:28:32Z |
| 20057593 |        2 | 2019-09-06T22:13:21Z | 2019-09-06T22:13:21Z |
| 20561952 |        3 | 2019-09-12T13:20:51Z | 2019-09-13T12:54:36Z |
| 20659650 |        3 | 2019-09-13T10:53:53Z | 2019-09-13T11:02:03Z |
| 21006513 |        2 | 2019-09-16T14:00:32Z | 2019-09-16T14:00:32Z |
| 21118706 |        5 | 2019-09-17T13:34:24Z | 2019-09-18T13:55:06Z |
| 22025968 |        2 | 2019-09-25T13:52:14Z | 2019-09-25T13:52:14Z |
| 22028864 |        2 | 2019-09-25T14:31:02Z | 2019-09-25T14:31:02Z |
| 22129515 |        2 | 2019-09-26T14:26:17Z | 2019-09-26T14:26:17Z |
| 22190039 |        5 | 2019-09-27T07:34:53Z | 2019-09-27T23:42:03Z |
| 22203054 |        2 | 2019-09-27T10:43:51Z | 2019-09-27T10:43:51Z |
| 22571308 |        2 | 2019-10-01T14:33:50Z | 2019-10-01T14:33:50Z |
| 22596178 |        2 | 2019-10-01T22:13:25Z | 2019-10-01T22:13:25Z |
+----------+----------+----------------------+----------------------+
44 rows in set (0.00 sec)

ChristophWurst commented 5 years ago

> I don't know what the records in this table mean. However, I logged out and back in from a web browser, and restarted the client on a machine, without any change in this table (not even in the last_seen column).

The login data is not fed directly into that table. It first goes into oc_login_address, and a background job updates oc_login_address_aggregated asynchronously.

> In case you doubt my query, here is the content of the table:

That is indeed strange. Do you use some sort of proxy in front of Nextcloud? Does Nextcloud even see the client IPs?

ChristophWurst commented 5 years ago

> I don't know what the records in this table mean

It's basically a compressed version of oc_login_address, in which every login is stored as a row. The aggregated data uses a counter to group identical (uid, ip) tuples. The timestamps show when a (uid, ip) pair was used first and last. In your case this compressed 30M entries into <50 rows ;)
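The aggregation described above can be sketched roughly like this. It is an illustration only, not the app's background job: the input row shape `(uid, ip, timestamp)` and the names `Aggregate` and `aggregate_logins` are assumptions, and the sample addresses are invented. The `seen`, `first_seen`, and `last_seen` fields mirror the columns shown earlier in the thread.

```python
from collections import namedtuple

# One aggregated row per (uid, ip) pair.
Aggregate = namedtuple("Aggregate", "uid ip seen first_seen last_seen")


def aggregate_logins(logins):
    """Collapse (uid, ip, timestamp) login rows into one row per (uid, ip)."""
    grouped = {}
    for uid, ip, ts in logins:
        key = (uid, ip)
        if key not in grouped:
            # First login for this pair: counter starts at 1.
            grouped[key] = Aggregate(uid, ip, 1, ts, ts)
        else:
            agg = grouped[key]
            grouped[key] = agg._replace(
                seen=agg.seen + 1,
                first_seen=min(agg.first_seen, ts),
                last_seen=max(agg.last_seen, ts),
            )
    return list(grouped.values())


logins = [
    ("alice", "192.0.2.1", 100),
    ("alice", "192.0.2.1", 300),
    ("alice", "192.0.2.1", 200),
    ("bob", "198.51.100.7", 150),
]
for row in aggregate_logins(logins):
    print(row)
```

Four login rows collapse into two aggregated rows here, which is the same effect that turned roughly 30M logins into fewer than 50 rows in the table above.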

jcheger commented 5 years ago

This instance of Nextcloud is the only one I have without a reverse proxy. Instead, I have a 1:1 NAT configured in pfSense, which means there is a dedicated IP address for this service that is also used for outgoing traffic.

The 50 rows are not much of a surprise. We are only a few users, usually connecting from the same IP addresses.

ChristophWurst commented 5 years ago

The problem here is: the current logic tries to split the collected data into two sets, training data and validation data. The validation data consists of the IPs that have only been seen in the last week. The idea behind this is to get a metric of how well the model reacts to historically new data. If your IPs hardly ever change, there won't be anything new recently.

This is a conceptual problem. I'm not sure if this is solvable easily.
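To make the failure mode concrete, here is a rough sketch of such a split. It is not the app's implementation: `split_training_validation`, `InsufficientDataError`, and the `utc` helper are hypothetical names, and the latest last_seen value from the table above is used as a stand-in for the current time. The two sample rows are the oldest and the newest first_seen values from that table.

```python
from datetime import datetime, timedelta, timezone


class InsufficientDataError(Exception):
    """Raised when no address was first seen inside the validation window."""


def split_training_validation(rows, now, window=timedelta(weeks=1)):
    """Split rows by first_seen: older rows train, recent rows validate."""
    cutoff = (now - window).timestamp()
    training = [row for row in rows if row["first_seen"] <= cutoff]
    validation = [row for row in rows if row["first_seen"] > cutoff]
    if not validation:
        raise InsufficientDataError("Insufficient data: No recent data available")
    return training, validation


def utc(*args):
    """Build a timezone-aware UTC datetime."""
    return datetime(*args, tzinfo=timezone.utc)


rows = [
    {"id": 1, "first_seen": utc(2019, 6, 4, 22, 34, 36).timestamp()},
    {"id": 22596178, "first_seen": utc(2019, 10, 1, 22, 13, 25).timestamp()},
]
now = utc(2019, 10, 9, 6, 41, 36)  # assumption: latest last_seen as "now"

try:
    split_training_validation(rows, now)
except InsufficientDataError as exc:
    print(exc)
```

With these values the newest address was first seen a little over a week before the assumed current time, so the validation set is empty and the split fails, even though the same addresses are still logging in regularly.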

diyoyo commented 1 year ago

So basically, you're saying that this app is of no use when the instance is safe and only used by a few users? What if there is one big attacker in these early stages of the Nextcloud instance?

Honestly, I believe hackers have better things to do than target ultra-small teams, so if this add-on is not useful in that particular case, I'd rather disable it to avoid warnings in the log section.

It keeps telling me that the models are not present (Could not predict suspiciousness: No models found) or that there is not enough data.