EtienneLamoureux / sc-trade-companion

Companion application for SC Trade Tools
https://sc-trade.tools
GNU General Public License v3.0
29 stars 7 forks

Fix decimal parsing #73

Closed larhape closed 1 month ago

larhape commented 4 months ago

[Screenshot: 2024-06-02_14-20-11-740871300]

As shown in the screenshot, some prices are displayed with decimal values, for example "Janalite", "Medical Supplies" and "Pitambu". When SC-Trade-Companion processes the image, the output prices are incorrect. Here is the extract of the CSV corresponding to this screenshot:

seraphim station,"BUYS","janalite","105.0","0","out of stock","2024-06-02 14:20:30"
seraphim station,"BUYS","medical supplies","2000.0","485","very high inventory","2024-06-02 14:20:30"
seraphim station,"BUYS","pitambu","997000.0","0","out of stock","2024-06-02 14:20:30"
seraphim station,"BUYS","processed food","193.0","0","out of stock","2024-06-02 14:20:30"
seraphim station,"BUYS","prota","999000.0","5","out of stock","2024-06-02 14:20:30"
seraphim station,"BUYS","ranta dung","9000.0","0","out of stock","2024-06-02 14:20:30"
seraphim station,"BUYS","revenant pod","3000.0","0","out of stock","2024-06-02 14:20:30"
seraphim station,"BUYS","stims","379.0","0","out of stock","2024-06-02 14:20:30"
seraphim station,"BUYS","hephaestanite","1000.0","0","out of stock","2024-06-02 14:20:30"

Janalite is reported with a price of "105.0", but it should be "17675991" (i.e. 17.67... M aUEC/unit).
Medical Supplies is reported as 2000, but it should be 1794 (i.e. 1.794 k aUEC/unit).
Pitambu should be 1649 or 1650, NOT 997000.

etc.

I suppose the OCR only picks up the last few characters before the "/UNIT" and ignores the rest when the number is too long. That would explain why "1.79400002K/UNIT" is cut down to "002K/UNIT" for Medical Supplies, which is then read as 2000. The same goes for Prota: "1.37999999K/UNIT" is cut down to "999K/UNIT" and becomes 999000.

Also, for Janalite, SC-Trade-Companion does not interpret the "M" (million) suffix correctly.

I think the OCR should catch the leading character. I do not know how to type it, so let's call it "$": it is the strange character at the beginning of every price in my screenshot. The price is then the number between "$" and "/UNIT". Its last character may be a K or an M, for 1000 or 1000000.
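To illustrate this idea, here is a minimal Java sketch that extracts the number between the leading symbol and "/UNIT" and applies the K or M multiplier. The class name `UnitPriceParser`, its `parse` method and the regular expression are hypothetical and are not taken from SC-Trade-Companion. The sketch assumes the OCR returns the whole price as one string and that the leading symbol is read as some non-digit character ("$" is used as a stand-in).

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of SC-Trade-Companion. */
final class UnitPriceParser {
    // Any non-digit prefix (the currency symbol, however the OCR reads it),
    // then the number, an optional K/M suffix, and the "/UNIT" terminator.
    private static final Pattern PRICE = Pattern.compile(
            "^\\D*?(\\d+(?:\\.\\d+)?)\\s*([KM])?\\s*/\\s*UNIT\\s*$",
            Pattern.CASE_INSENSITIVE);

    /** Returns the price in aUEC per unit, or empty if the text is not a price. */
    static Optional<Long> parse(String ocrText) {
        Matcher m = PRICE.matcher(ocrText.trim());
        if (!m.matches()) {
            return Optional.empty();
        }
        BigDecimal value = new BigDecimal(m.group(1));
        String suffix = m.group(2) == null ? "" : m.group(2).toUpperCase(Locale.ROOT);
        if (suffix.equals("K")) {
            value = value.movePointRight(3); // multiply by 1000
        } else if (suffix.equals("M")) {
            value = value.movePointRight(6); // multiply by 1000000
        }
        // Round to whole aUEC; this is a choice made for the sketch.
        return Optional.of(value.setScale(0, RoundingMode.HALF_UP).longValueExact());
    }

    public static void main(String[] args) {
        System.out.println(parse("$1.79400002K/UNIT")); // Optional[1794]
        System.out.println(parse("$17.675991M/UNIT"));  // Optional[17675991]
        System.out.println(parse("no price here"));     // Optional.empty
    }
}
```

This only helps if the OCR output actually contains the full price. If the text is already truncated to "002K/UNIT", no parsing can recover the missing digits.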

Another option is to look for the dot ".": take the digits before the dot and multiply them by 1000 or 1000000 if the price ends with K or M, then add the 3 (or 6) digits after the dot to get the price. If there is no dot, parse the price as is done today.
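The dot-based option can be sketched with plain string handling. The class `DotSplitPrice` and its `toUnits` method are hypothetical names for illustration, and the sketch assumes the number and its suffix have already been separated from the rest of the OCR text. Appending the first 3 (or 6) fractional digits to the whole part is equivalent to multiplying the whole part by 1000 (or 1000000) and adding those digits.

```java
/** Hypothetical helper, not part of SC-Trade-Companion. */
final class DotSplitPrice {
    /**
     * Converts a number such as "1.79400002" with suffix 'K' to whole aUEC.
     * Any other suffix character (for example a space) means "no multiplier".
     * Throws NumberFormatException if the text is not a plain decimal number.
     */
    static long toUnits(String number, char suffix) {
        int digits;
        switch (Character.toUpperCase(suffix)) {
            case 'K': digits = 3; break; // thousands: keep 3 fractional digits
            case 'M': digits = 6; break; // millions: keep 6 fractional digits
            default:  digits = 0;        // no suffix: drop the fraction
        }
        int dot = number.indexOf('.');
        String whole = dot < 0 ? number : number.substring(0, dot);
        String fraction = dot < 0 ? "" : number.substring(dot + 1);

        // Keep exactly `digits` fractional digits: pad with zeros, then cut the excess.
        StringBuilder kept = new StringBuilder(fraction);
        while (kept.length() < digits) {
            kept.append('0');
        }
        kept.setLength(digits);

        return Long.parseLong(whole + kept);
    }

    public static void main(String[] args) {
        System.out.println(toUnits("1.79400002", 'K')); // 1794
        System.out.println(toUnits("17.675991", 'M'));  // 17675991
        System.out.println(toUnits("1.37999999", 'K')); // 1379
    }
}
```

Note that this variant truncates rather than rounds, so "1.37999999K" gives 1379 instead of 1380. That is the same kind of one-unit ambiguity as "1649 or 1650" for Pitambu.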

I hope this helps! SC-Trade-Companion is really cool, keep up the brilliant work!

A fan.

larhape commented 4 months ago

I had a look at RawCommodityListing.java. It is difficult to debug without a full Java environment, which I do not have. But the RIGHT_PATTERN regular expression (especially the \r) suggests that the input string is made up of the rightmost column of the screen: the number of SCU and, below it, the price. So it might be that the screenshot extraction does not extend far enough to the left. I think making that cut is the role of the LocatedColumn class, so that might be the place to look?

Only supposition, of course, but if this helps...

EtienneLamoureux commented 1 month ago

The new neural network bundled with https://github.com/EtienneLamoureux/sc-trade-companion/releases/tag/1.0.3 is much more precise and makes far fewer decimal errors.