EDSM-NET / FrontEnd

Issues tracker for EDSM
https://www.edsm.net/

API: database dumps 'systemsWithCoordinates7days.json.gz' much smaller than 'systemsWithoutCoordinates.json.gz' #468

Closed GrimmerSchnitter closed 2 years ago

GrimmerSchnitter commented 2 years ago

Greetings Commanders,

it has come to our attention that there may be an issue with the size of the database dump containing the systems with coordinates for the last 7 days, i.e. "systemsWithCoordinates7days.json.gz", available from the nightly dumps.

I wonder if this fact is already known to you.

EDIT: The dump mentioned above only has 220242 individual lines of code, which seems a little low...

best regards,

der grimme Schnitter
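For anyone wanting to reproduce the line count quoted above, here is a minimal sketch. It assumes the dump is a single JSON array written with one system object per line (plus separate lines for the opening and closing brackets); that layout is an assumption, not something confirmed in this thread. The function name `count_systems` is made up for illustration.

```python
import gzip
import json


def count_systems(path):
    """Count system records in a gzip-compressed dump.

    Assumes one JSON object per line, wrapped in '[' and ']' lines
    (an assumed layout, not confirmed in this thread).
    """
    count = 0
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            # Drop surrounding whitespace and the comma separating array elements.
            text = line.strip().rstrip(",")
            # Skip blank lines and the lines that hold only the array brackets.
            if text in ("", "[", "]"):
                continue
            json.loads(text)  # raises ValueError if the line is not valid JSON
            count += 1
    return count
```

Calling `count_systems("systemsWithCoordinates7days.json.gz")` on a downloaded dump would then give the number of systems rather than the raw line count, which under the assumed layout differs by the two bracket lines.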

klightspeed commented 2 years ago

> EDIT: The dump mentioned above only has 220242 individual lines of code, which seems a little low...

At 220242 systems per week, it would take 315 weeks (about 6 years) to get the 69402979 systems with coordinates in EDSM. EDSM has been running since about May 2015 (almost 350 weeks).
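The arithmetic behind that estimate can be checked with a few lines. The figures are the ones quoted in this comment; the 350-week age is the rough value given above, and the long-run average derived from it is only a back-of-the-envelope comparison.

```python
SYSTEMS_IN_WEEKLY_DUMP = 220_242       # entries in the 7-day dump
SYSTEMS_WITH_COORDINATES = 69_402_979  # total systems with coordinates in EDSM
EDSM_AGE_WEEKS = 350                   # rough age of EDSM, as quoted above

# Weeks needed to reach the current total at the current weekly rate.
weeks_needed = SYSTEMS_WITH_COORDINATES / SYSTEMS_IN_WEEKLY_DUMP
years_needed = weeks_needed / 52

# Average number of systems added per week over EDSM's lifetime.
average_per_week = SYSTEMS_WITH_COORDINATES / EDSM_AGE_WEEKS

print(f"{weeks_needed:.0f} weeks, about {years_needed:.1f} years")
print(f"long-run average: about {average_per_week:,.0f} systems per week")
```

In other words, the current weekly figure is of the same order as the long-run average, which is why 220242 systems in a week is not unusually low.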

72% of the 1868063 systems in the systemsWithoutCoordinates.json.gz file are from before June 2016 (when coordinates were added to the netlog by Frontier) and 78% are before October 2016 (when the player journal was introduced by Frontier). Someone would need to visit these systems and submit their data to EDSM, or trilaterate them, in order for EDSM to get coordinates for them.

GrimmerSchnitter commented 2 years ago

OK, thanks for showing your point. I was just wondering about the difference compared to other data sources (spansh has about 670000 systems in the same time period). But if you say you're OK with 220000 systems, I have to live with that.

Fly safe and stay curious,

der grimme Schnitter

klightspeed commented 2 years ago

> OK, thanks for showing your point. I was just wondering about the difference compared to other data sources (spansh has about 670000 systems in the same time period). But if you say you're OK with 220000 systems, I have to live with that.

EDSM's 7 days dump only includes systems where the system itself was updated in the last 7 days, while Spansh's 7 days dump also includes systems where any bodies or stations in those systems have been updated in the last 7 days.

Spansh also includes systems from submitted routes, not just visited systems. From the events that have come through EDDN, including systems in routes submitted via EDDN, 377609 new systems would have been added between 31 Jan 2022 and 07 Feb 2022, of which 241190 would have only been present in submitted routes (and so not included in EDSM's systems data).
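The difference between the two selection rules can be sketched roughly as follows. The record layout (`updated`, `child_updates`) and both function names are hypothetical and chosen only for illustration; neither EDSM's nor Spansh's actual export code is shown in this thread.

```python
from datetime import datetime, timedelta


def in_edsm_style_dump(system, now, days=7):
    """Include a system only if the system record itself changed recently."""
    cutoff = now - timedelta(days=days)
    return system["updated"] >= cutoff


def in_spansh_style_dump(system, now, days=7):
    """Include a system if it, or any of its bodies or stations, changed recently."""
    cutoff = now - timedelta(days=days)
    timestamps = [system["updated"], *system.get("child_updates", [])]
    return any(ts >= cutoff for ts in timestamps)


# Hypothetical example: a system record last changed in 2021, but one of its
# stations was updated a few days before the dump was generated.
now = datetime(2022, 2, 7)
old_system_new_station = {
    "name": "Example System",                  # made-up name
    "updated": datetime(2021, 6, 1),
    "child_updates": [datetime(2022, 2, 5)],   # body/station update times
}

print(in_edsm_style_dump(old_system_new_station, now))    # False
print(in_spansh_style_dump(old_system_new_station, now))  # True
```

Under the second rule, any recent body or station update pulls the whole system into the dump, so the same week of activity yields a larger file.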

GrimmerSchnitter commented 2 years ago

Thank you for clearing that up.

spansh commented 2 years ago

It's actually a little more nuanced than this. You're exactly correct in that my dumps contain any system where any piece of data has changed within the last x days (which can include a fleet carrier jumping in). However, EDSM gets roughly 150,000 to 200,000 systems a month which I don't get (until I import from EDSM, which I do once a month) because some players submit data only to EDSM directly and not to EDDN.

However, if people wanted to confirm the coordinates of those systems in systemsWithoutCoordinates, the best way to do that with everything we have in place currently is to have something like EDMC running, and then to plot a route to those systems one by one. Those would then be submitted to EDDN, where EDSM (and everyone else) can add them to their databases complete with coordinates and main star type.