eyeonus / Trade-Dangerous

Mozilla Public License 2.0

eddblink: avoid importing duplicate entries - RESCUE SHIPs #90

Closed ghost closed 3 years ago

ghost commented 3 years ago

Hi there :-) could we check for duplicates (especially with rescue ships) while importing data/eddb/stations.jsonl? Currently I have a lot of duplicates interfering with "trade run", like:

TradeDangerous.prices:4272923 ERROR Second occurrance of station "COALSACK SECTOR KN-S B4-9/RESCUE SHIP - BERING PORT", previous entry at line 4272903.
TradeDangerous.prices:4273783 ERROR Second occurrance of station "HAKI/RESCUE SHIP - SHERRILL ORBITAL", previous entry at line 4273711.
TradeDangerous.prices:4273930 ERROR Second occurrance of station "WELLINGTON/RESCUE SHIP - EZRA POINT", previous entry at line 4273738.

These errors occur while executing "trade run" (after multiple tries with manual edits to stations.jsonl). A "trade import -P eddblink -O clean" or "trade import -P eddblink -O force,all" obviously does not help, as the duplicates are reimported from stations.jsonl ;-)

Can someone confirm this behavior? Or suggest how to fix it on the command line?
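As a sanity check, duplicates in the dump itself could be spotted before importing. A minimal sketch, assuming each line of stations.jsonl is a standalone JSON object; the key name "id" is an assumption, so adjust it if the dump uses a different unique field:

```python
import json
from collections import Counter

def find_duplicates(path, key="id"):
    """Count occurrences of `key` across a .jsonl file and
    return only the values that appear more than once."""
    counts = Counter()
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # skip blank lines in the dump
            counts[json.loads(line)[key]] += 1
    return {value: n for value, n in counts.items() if n > 1}
```

An empty result would mean the dump is clean and the duplication happens later, during import.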

Thanks for this awesome tool!

eyeonus commented 3 years ago

You can check yourself using any SQLite database browser.

I myself use "SQLite Database Browser" (the portable version, no installation needed), but really any will do. Stations and systems are stored according to their FDevID, which for stations is stored under the 'station_id' tag and should be unique for every station, so I'm not sure what's going on...

I would need to get a copy of your database to check myself, it's basically everything in the /data folder.

I'd like you to zip up the data folder, then do a 'clean' run of the eddblink plugin and try that run again, to see if the error pops up.

If it does, then upload the zipped folder somewhere (Google Drive is preferred) so I can look at it myself.

I would also like the 'run' command the problem occurs in.
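For reference, the same duplicate check can be run directly against the generated SQLite database. A rough sketch, assuming a Station table keyed by station_id with a system_id foreign key into a System table; these table and column names are inferred from the comment above, so verify them against the actual schema in a database browser first:

```python
import sqlite3

def duplicate_stations(db_path):
    """Return (system, station, count) rows where the same station
    name appears more than once within one system -- the symptom
    reported by the .prices errors above."""
    query = """
        SELECT sys.name, stn.name, COUNT(*) AS n
          FROM Station AS stn
          JOIN System  AS sys USING (system_id)
         GROUP BY sys.name, stn.name
        HAVING n > 1
    """
    with sqlite3.connect(db_path) as conn:
        return conn.execute(query).fetchall()
```

If this returns rows while station_id is still unique, the duplication is at the name level rather than the FDevID level.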

ghost commented 3 years ago

Sure: trade run +hops_s0_4x2.txt +cutter_banshee.txt --from lhs6031/polan

I just tried trade import -O clean,fallback and it seems to work now.

ghost commented 3 years ago

Oh, the .tdrc_run contains:

-J
-vvv
--progress
--summary
--color
--age=1.5
--credits=200m
--mgpt=5m
--gain-per-ton=8000
--margin=0.2
--avoid=dromi,matet,otegine,sol,wolfsegen,sirius
--fleet-carrier=N
--planetary=N
--prune-hops=3
--prune-score=12.5
--ls-max=0
--ls-penalty=7.5
ghost commented 3 years ago

I had logged this behavior in my house-internal bugtracker on 2021-03-10 and again today. Maybe there is (or was) some redundant data in Tromador's dataset. It feels like it occurs only during a specific timeframe when rescue ships "move" during CGs.

Or maybe just "stupid user" ;-)

eyeonus commented 3 years ago

Okay, next time it happens, back up the database as-is and make it available to me, along with the full output of the run that found the problem. The best and easiest way to do this is to add -w > td.issue.90.log 2>&1 to the end of the actual run command and redo the run, then copy and paste the log that gets made, like so:

example.log:

# Command line was: ['..\\trade.py', 'import', '-P', 'eddblink', '-O', 'all,skipvend', '-vvv', '-w']
# Loading Database. 2021-03-23 13:54:33.731292
# Database loaded.
NOTE: Checking for update to 'modules.json'.
NOTE: Downloading file 'modules.json'.
NOTE: Requesting https://elite.tromador.com/files/modules.json
.
.
.

If the output is extremely long, link to it on something like Pastebin if possible, or on something like Mega if needed.

ghost commented 3 years ago

Understood. Will keep you posted if that happens again. Thanks a lot.

ghost commented 3 years ago

Ok, today it happened again. To get a clean start I deleted everything: all scripts, logs, and the whole "data" directory. But I still encountered the duplicate entry. Hope this helps.

eyeonus commented 3 years ago

Alright. I will take a look and see what can be done about this.

ghost commented 3 years ago

Thank you :-) But it could still be a case of stupid me issuing the wrong commands and breaking things. I am still investigating to see if I can find a pattern on my side.

ghost commented 3 years ago

Just some notes while I'm investigating: it seems to randomly break at some point while running "trade import -P eddblink". Sometimes it can be fixed by a "trade import -P eddblink -O purge". I have moved the tradedangerous directory to a local SSD now; as you can see in the logs, it is normally run from an SMB share, and maybe that is a bad combination with the SQLite database.

ghost commented 3 years ago

I reconfigured my Samba file server to do only synchronous writes. The default, "aio write size = 1", basically enables asynchronous writes; I set "aio write size = 0" in my /etc/smb.conf to force synchronous writes. As far as I can tell it looks stable now. Reference: https://www.samba.org/samba/docs/4.11/man-html/smb.conf.5.html#AIOWRITESIZE
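For anyone hitting the same problem, the change described above would look roughly like this in smb.conf; the share name and path here are hypothetical placeholders, and only the aio setting is the actual fix:

```ini
# /etc/smb.conf -- share hosting the Trade-Dangerous data directory
# (share name and path are examples, not from the original report)
[tradedangerous]
   path = /srv/tradedangerous
   # 0 disables asynchronous writes; the default (1) enables them for
   # all write sizes, which appears to race with SQLite's file writes.
   aio write size = 0
```

The setting can go in the [global] section instead to apply server-wide; restart or reload Samba afterwards for it to take effect.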

ghost commented 3 years ago

The ticket can be closed. Setting "aio write size = 0" in /etc/smb.conf resolves the inconsistencies/duplicates for me.

eyeonus commented 3 years ago

Okay then.