vatplanner / importer-status-postgis

VATPlanner Status Importer to PostGIS
MIT License
0 stars 0 forks source link

connections may split #1

Closed dneuge closed 4 years ago

dneuge commented 4 years ago

Only one pilot connection should be recorded per session. Reading the graph from database relies on that assumption. However, sessions have become split over multiple connections with an apparent inconsistent mix-up of records.

Query to find split sessions:

select *
from connections c
where c.connection_id in (
    select distinct(c1.connection_id)
    from connections c1
    left outer join connections c2 on c1.vatsimid = c2.vatsimid and c1.firstreport_id = c2.firstreport_id and c1.logontime = c2.logontime
    where c1.connection_id != c2.connection_id
)
order by vatsimid, logontime, connection_id;
dneuge commented 4 years ago

Suggestion: Clear database for fresh start, then add a unique constraint over vatsimid and logontime. Retry import and investigate failing inserts to identify root cause.

dneuge commented 4 years ago

Caused by vatplanner/dataformats-vatsim-public#7

dneuge commented 4 years ago

Remember that connections have a different meaning for ATC and duplications are allowed in that case. The unique constraint must not be made permanent or import of ATC facilities will currently fail.

Check if separating facility uptime from connections would be a good idea.

dneuge commented 4 years ago

After fixing vatplanner/dataformats-vatsim-public#7 fetching reports from database is now incomplete:

2020-06-30T14:57:57,390 - DEBUG -                             Database - read 1603 reports from database
2020-06-30T14:57:57,418 - WARN  -                             Database - caught exception during DB transaction, rolling back
java.lang.RuntimeException: connection ID 4555481: report (first) with ID 217426 has not been loaded
        at org.vatplanner.importer.postgis.status.database.Database.lambda$null$7(Database.java:461) ~[status-importer-postgis-0.1-SNAPSHOT-jar-with-dependencies.jar:?]
        at org.vatplanner.importer.postgis.status.database.Database.query(Database.java:186) ~[status-importer-postgis-0.1-SNAPSHOT-jar-with-dependencies.jar:?]
        at org.vatplanner.importer.postgis.status.database.Database.lambda$loadReportsSinceRecordTime$20(Database.java:442) ~[status-importer-postgis-0.1-SNAPSHOT-jar-with-dependencies.jar:?]
        at org.vatplanner.importer.postgis.status.database.Database.lambda$performTransactional$0(Database.java:132) ~[status-importer-postgis-0.1-SNAPSHOT-jar-with-dependencies.jar:?]
        at org.vatplanner.importer.postgis.status.database.Database.withConnection(Database.java:109) [status-importer-postgis-0.1-SNAPSHOT-jar-with-dependencies.jar:?]
        at org.vatplanner.importer.postgis.status.database.Database.performTransactional(Database.java:120) [status-importer-postgis-0.1-SNAPSHOT-jar-with-dependencies.jar:?]
        at org.vatplanner.importer.postgis.status.database.Database.loadReportsSinceRecordTime(Database.java:251) [status-importer-postgis-0.1-SNAPSHOT-jar-with-dependencies.jar:?]
        at org.vatplanner.importer.postgis.status.StatusImport.importNextChunk(StatusImport.java:117) [status-importer-postgis-0.1-SNAPSHOT-jar-with-dependencies.jar:?]
        at org.vatplanner.importer.postgis.status.Main.run(Main.java:63) [status-importer-postgis-0.1-SNAPSHOT-jar-with-dependencies.jar:?]
        at org.vatplanner.importer.postgis.status.Main.main(Main.java:40) [status-importer-postgis-0.1-SNAPSHOT-jar-with-dependencies.jar:?]
dneuge commented 4 years ago

Also, rerunning the query from above still returns duplicate connections. Needs further investigation.

dneuge commented 4 years ago

Modified query to only show pilot connections by hiding facilities (ATC):

select c.*
from connections c
left outer join facilities f on c.connection_id = f.connection_id
where c.connection_id in (
    select distinct(c1.connection_id)
    from connections c1
    left outer join connections c2 on c1.vatsimid = c2.vatsimid and c1.firstreport_id = c2.firstreport_id and c1.logontime = c2.logontime
    where c1.connection_id != c2.connection_id
)
and f.connection_id IS NULL
order by vatsimid, logontime, c.connection_id;
dneuge commented 4 years ago

Checked some cases again and all appear to mix pilot and ATC connections. The query from above still reports connections if

Since pilot connections are now deduplicated with vatplanner/dataformats-vatsim-public#7 no further action should be needed at this time.

It might make sense to somehow split ATC facilities from connections, however.

dneuge commented 4 years ago

Little "fun" fact: ToS violations apparently lead to both clients eventually being kicked from the network by supervisors, as expected. In most cases self-observing pilots do reconnect both their clients a few times (also sort of expected...) which leads to a large amount of cases reported by above query. I'm not sure why logon times are identical though - looks like proxies are being used to achieve those multi-connections unless it's just another data file oddity.