eyeonus / Trade-Dangerous

Fix the need to generate prices - pretty please with sugar on top. #141

Open Tromador opened 2 months ago

Tromador commented 2 months ago

I saw the little apology - thank you.

However: importing live took 1 minute 18 seconds, including the time for "Optimizing database ...". Regenerating .prices took 2 minutes and 7 seconds.

Importing live therefore took (on this occasion) 3 minutes and 25 seconds, nearly two thirds (62%, to be precise) of which was spent making the prices file.

(I actually just did another two imports where importing live took less than a minute, and most of that was optimizing; the actual import was about 5 seconds, and .prices was about the same almost to the second, thus prices is taking twice as long as the import.)

So I've put it as low priority on purpose. I don't want or expect a quick fix, but please fix it eventually. I remain unconvinced of the value of exporting .prices except for an edge case where the server has to import a full set of data from some source and that's the only available dataset.

More urgently, however, please let me know what I need to do to revert the listener, as we changed it to use the prices option.

eyeonus commented 2 months ago

Comment out

                    # listener, server side: this is the call that runs a full
                    # eddblink import with the 'prices' option, regenerating .prices
                    if config['side'] == 'server':
                        trade.main(('trade.py', 'import', '-P', 'eddblink', '-O', 'prices'))

I've already pushed the fix as well.

eyeonus commented 2 months ago

Whenever TD decides it needs to rebuild the cache ("this may take a few moments"), if there isn't a .prices file, it will destroy the market data.

TD decides it needs to do that surprisingly often. So until we get around to divorcing TD from its need for the prices file (which is planned), we're going to have to keep exporting it every time.

Tromador commented 2 months ago

For a long time I was simply running the server, and my concerns were limited to "is the listener running", "is live getting overly large", and "how is the database performing" - entirely abstracted away from the users who actually had to get information out of the thing.

I did reinstall Elite when we got a working client again, at the time just because I had a bit of a feel-good about our efforts and wanted to run around doing a bit of trading because I could. However, being a user has opened my eyes and given me a different perspective on TradeDangerous, and that perspective is summarised by "a lot of waiting about".

TD seems to do a lot of things, as you put it, "surprisingly often", and for a live import it's those things which take up the bulk of the time, rather than the actual insertion of data into the DB. The actual import phase of a live takes less time than the download did - literally 5-10 seconds for the progress bar to zoom from left to right. The rest of the 3 minutes is optimisation, the .prices file, and whatever else it does.

It's important to get the import time down, because importing a full listings.csv still takes a chunk of time, but that's a one-and-done, perhaps at the start of that day's session. After that, we still need to consider how we can improve the user experience.

I do understand there's only so many hours in the day - heaven knows I've put in a ton of hours over the past couple of weeks, and you've (@eyeonus) done probably three times what I have. There's got to be prioritisation, and we have lives and responsibilities in which TD is somewhere down the list (though with the time it's eating, one would wonder). With all that said, the user experience kinda sucks, and it's something we are going to have to address.

eyeonus commented 2 months ago

Oh, I completely agree. In all truth, I think we've done enough on the import side of things (other than the prices file bit), and should start concentrating on the actual usage. This includes but is not limited to doing away with the prices file (and the other cache files) entirely.

From a usage perspective, what I noticed myself doing was: before I undocked to start my current run, I would input the ending station and ending credits (minus ~10% of the start-vs-finish difference, to be safe) for a new run calculation, and have that go in the background while I was doing the run it had just given me. It would usually have a new run waiting for me by the time I docked at the final station. Not the best solution, I'll admit.
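For anyone wanting to copy that trick, a rough sketch (the station name, credits, and ship stats here are made up; the flags follow trade.py run's usual options):

# Hypothetical: queue the next run calculation in the background,
# then collect the result once docked at the final station.
import subprocess

next_run = subprocess.Popen(
    ["python", "trade.py", "run",
     "--from", "Shinrarta Dezhra/Jameson Memorial",  # the ending station
     "--credits", "5400000",   # ending credits minus a ~10% safety margin
     "--capacity", "720",
     "--ly-per", "20"],
    stdout=subprocess.PIPE, text=True,
)
# ... fly the current run ...
print(next_run.communicate()[0])  # usually ready by dock time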

I didn't add your input to my list of priorities, but I would say it's definitely next on the list - maybe N=-19.

Tromador commented 2 months ago

Just as a thought - we shouldn't completely remove support for .prices, at least for import. EDMC still supports output to this format, and it's good both for testing and for Commanders who collect their own private data.

eyeonus commented 2 months ago

Oh no, just removing internal dependence on TradeDangerous.prices itself, not removing support for the format.

Tromador commented 2 months ago

Yup. Good. The thought just entered my brain, and I'd rather raise it and not need to. :)

kfsone commented 2 months ago

The format exists to allow you to transcribe what you see at the market screen, and during beta, when there were fewer stations than I weigh in pounds, it didn't hurt having one large file like that.

There's zero reason we couldn't replace it with the csv format for the main "pre-cache" and keep the parser for anyone who actually wants to transcribe the Market Screen into notepad/textedit/gvim to update their data.

I don't think we've needed the pre-cache in a very long time - we don't eff the sql db up often enough - and if we can make import fast enough, nobody should care. We could also add backup/restore commands (I mean, they'd just be sugar around cp).

On the import side, my import-via-table branch suggests that the real kicker there is building the sell/buy price indexes, and those exist because some dickwad named Oliver whoever transcribed the item-line from the .prices file into the StationItem table instead of making two tables. What a loser.

I tried dumbing those indexes down to just price-based, and that made the run command nearly unusable (it takes about 2 minutes on my pc to finish loading tradeable items).

I haven't tried making those fields nullable and ensuring that a price=0 becomes a price=null. There's a 50/50 chance that might eliminate 75% of the import cost and make run workable again. See https://github.com/kfsone/Trade-Dangerous/tree/kfsone/import-via-table if you want to experiment, but I'm going to be working extra hours today and tomorrow, so I may not get a chance to do the experiment myself.

Split StationItem into Supply/Demand tables and you won't need those conditional indexes.
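For illustration, a minimal sqlite sketch of what that split might look like (table and column names here are illustrative, not TD's actual schema):

# Illustrative only: split the combined StationItem table into separate
# supply/demand tables so every row is an unconditional fact, and the
# plain primary keys replace the conditional "price > 0" indexes.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE StationSelling (      -- station sells item to the player
        station_id INTEGER NOT NULL,
        item_id    INTEGER NOT NULL,
        price      INTEGER NOT NULL,   -- no row at all means 'not sold here'
        units      INTEGER,
        PRIMARY KEY (station_id, item_id)
    ) WITHOUT ROWID;

    CREATE TABLE StationBuying (       -- station buys item from the player
        station_id INTEGER NOT NULL,
        item_id    INTEGER NOT NULL,
        price      INTEGER NOT NULL,
        units      INTEGER,
        PRIMARY KEY (station_id, item_id)
    ) WITHOUT ROWID;
""")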

kfsone commented 2 months ago

Another possible solution would be to completely dumb-down the sqlite data, get rid of the indexes entirely, and have our own long-running "tradeos" (os: operating system. my initials? ogs, why do you ask? :grin: ) that could implement things like my bitmap idea without needing an on-disk persistent store for them... and the trade commands simply talk to it via whatever mechanism you want.

The route to that might be to create a "repl" sub-command which stays resident, and then iterate that into a version that accepts commands from the network or something.

# trade.py -- sketch: every CLI invocation talks to a resident backend,
# forking one into existence if none is running yet. The helpers
# (connect_to_existing_backend, try_to_fork_and_become_the_backend,
# Timeout) stand in for the real implementation.
import sys
import time

import rich

def main(args):
  backend = connect_to_existing_backend(tdenv)
  if backend is None:
    try_to_fork_and_become_the_backend()
    time.sleep(1)  # the backend has 1 job: to accept and answer requests, even if it answers 'hold on'.
    backend = connect_to_existing_backend(tdenv)

  with Timeout(n) as timeout:
    backend.send(args)

  with Timeout(n) as timeout:
    for out, err in backend.read():
      timeout.reset()
      if out: rich.print(out)
      if err: sys.stderr.write(f"{err}\n")

eyeonus commented 2 months ago

There's zero reason we couldn't replace it with the csv format for the main "pre-cache" and keep the parser for anyone who actually wants to transcribe the Market Screen into notepad/textedit/gvim to update their data.

That was my idea as well.

I don't think we've needed the pre-cache in a very long time....

Since manually entering information is not really done nowadays, I agree. I just don't want to go through the effort of figuring out how to make it stop doing that. ;)

eyeonus commented 2 months ago

Another possible solution would be to completely dumb-down the sqlite data, get rid of the indexes entirely, and have our own long-running "tradeos" (os: operating system. my initials? ogs, why do you ask? 😁 ) that could implement things like my bitmap idea without needing an on-disk persistent store for them... and the trade commands simply talk to it via whatever mechanism you want.

The route to that might be to create a "repl" sub-command which stays resident, and then iterate that into a version that accepts commands from the network or something.


I think that would be perfect if we had a decent GUI, since the "tradeos" could be started when the GUI was and stay running until the GUI was closed, but for the CLI, how would it know when to shut down the "tradeos", other than the client manually shutting it down?

kfsone commented 2 months ago

Once you have a backend like that, it should be agnostic - you can access it from either the client or the gui. Irritatingly, this is one of the things that is actually rather good about Perforce: it was built for GUI, and their client/server protocol reflects that. You don't get error codes, you get values to populate your dialog with, and instead of "disk error" there are super-specific errors like "your disk is on fire", "that's not a disk, THIS is a disk", "it's spelled disc", "your authentication provider hasn't paid their gas bill so your cookies are doughies" (but they're also not arbitrary - there is a specific message for each case, so you don't actually have to go writing regexes).

As for shutdown:

a. we would probably introduce a "tracerc" file in toml or yaml format,
b. it could specify an "idle shutdown" time, but that's likely going to be an hour or something - it shouldn't be actively using cpu or memory when you're not talking to it (a sketch of this, and of c., follows below),
c. a 'quit' command that the cli/gui can send,
d. if there's a problem in the gui, it could have the ability to "restart backend", which is just literally sending the current one a quit command and then doing the fork-and-become itself,

I'd personally want it to have the ability to create a task tray icon - there must be a way to do that in python, but the last time I looked was probably python 2.7, 9 years ago :) - then

e. "exit tradeos" from the tray menu.

kfsone commented 2 months ago

Additional thought there: the first time a user runs a command that triggers the 'fork' thing, it would check the 'rc file' and see if "user-knows-we-run-in-the-background" or something is true. If it is not, we say "Hey friend, I'm gonna do this thing ... you ok with that? [link to deets]", and if they say no, then instead of forking we run the tradeos stuff in-process and essentially talk "to ourselves".

(Implementation-wise that's not that difficult; you would want the tradeos proper to have a "listener" thread to keep it responsive, and as long as it's written somewhat agnostically, it won't care if you give it a pipe file descriptor to talk to itself rather than a socket to talk to a different process.)
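A minimal sketch of that in-process fallback (backend_loop is a stand-in for the real tradeos loop): hand one end of a socketpair to a listener thread and keep the other as the "client" connection.

# The frontend "talks to itself": the same loop that would sit behind a
# network socket runs on a daemon thread, fed by a socketpair instead.
import socket
import threading

def backend_loop(conn: socket.socket) -> None:  # hypothetical stand-in
    with conn, conn.makefile("r") as requests:
        for line in requests:
            conn.sendall(f"ack: {line}".encode())  # placeholder response

frontend, backend = socket.socketpair()
threading.Thread(target=backend_loop, args=(backend,), daemon=True).start()

frontend.sendall(b"run --from somewhere\n")
print(frontend.recv(4096).decode(), end="")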

eyeonus commented 2 months ago

System tray icon: pystray

I didn't look into it, but a quick Google search showed that's the library we want to use.
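For the record, a minimal pystray sketch (the icon image and the "Exit tradeos" handler are illustrative; pystray needs Pillow for the image):

# Tray icon with a single "Exit tradeos" menu entry.
import pystray
from PIL import Image, ImageDraw

def make_icon_image() -> Image.Image:
    image = Image.new("RGB", (64, 64), "black")
    ImageDraw.Draw(image).ellipse((8, 8, 56, 56), fill="orange")
    return image

def on_exit(icon: pystray.Icon, item: pystray.MenuItem) -> None:
    icon.stop()  # here we would also tell the backend to shut down

icon = pystray.Icon(
    "tradeos",
    make_icon_image(),
    "TradeDangerous backend",
    menu=pystray.Menu(pystray.MenuItem("Exit tradeos", on_exit)),
)
icon.run()  # blocks until on_exit calls icon.stop()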

kfsone commented 2 months ago

🤞🏽

Tromador commented 2 months ago

The format exists to allow you to transcribe what you see at the market screen, and during beta, when there were fewer stations than I weigh in pounds, it didn't hurt having one large file like that.

If you remember, we were using OCR of market screenshots; there was no CAPI, EDDN or similar gubbins that we now have, and we were all individually uploading bits of data to Maddavo. This is why things are stupid: in the "before time" they were sensible, and nobody wanted to do the work to update it[1] for the current, much improved data availability. So not only was there a lot less data[2], the way to get data into the system was much more laborious, and by necessity it was human readable, so that OCR errors could be corrected in Notepad (or whatever). If we were designing TD now, lots of things would be done differently.

[1] Not that anyone should be obliged to, it's all labours of love at the end of the day.
[2] For gargantuan values of "a lot".