Clipboard decoding service in javascript

pete-wn commented 8 years ago

Nearly two years ago I wrote some perl code to parse the text that is sent to the clipboard in-game when you press ^c on an item, for use with the price report macro. This has sorely needed updating due to changes in the layout of the clipboard data, but I never finished it.

Originally there was a case to rewrite this in perl and I did some initial work on it: https://github.com/trackpete/exiletools-indexer/issues/66

However, I've decided to abandon that in favor of writing a new re-coding engine in nodejs, specifically so it is more accessible to other tool builders if people want to contribute to improvements/maintenance in the future.

Similar to the original case, the system will work on a few basic principles:

Identify core attributes and stats from an item's clipboard data
Create a JSON formatted document with this information that mimics the Elasticsearch PWX JSON format from the Indexer
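
A minimal sketch of those two steps, assuming the clipboard text separates sections with a line of dashes (`--------`). The function names `clipboardToSections` and `buildDocument`, the output field names, and the sample text are illustrative placeholders, not the Indexer's actual schema or the code in related/text-to-json.

```javascript
// Split raw clipboard text into sections, each an array of trimmed lines.
// Assumption: sections are separated by a line made up only of dashes.
function clipboardToSections(text) {
  return text
    .split(/\r?\n-{4,}\r?\n/)
    .map((section) =>
      section
        .split(/\r?\n/)
        .map((line) => line.trim())
        .filter((line) => line.length > 0)
    )
    .filter((section) => section.length > 0);
}

// Build a skeleton JSON document from the first section.
// The field names here are hypothetical.
function buildDocument(text) {
  const sections = clipboardToSections(text);
  if (sections.length === 0) {
    throw new Error("No item data found in clipboard text");
  }
  const header = sections[0];
  const match = /^Rarity:\s*(.+)$/.exec(header[0]);
  if (!match) {
    throw new Error("Unable to identify item rarity");
  }
  return {
    attributes: { rarity: match[1] },
    info: { fullName: header.slice(1).join(" ") },
    sectionCount: sections.length,
  };
}

// Invented sample input for illustration only.
const sample = ["Rarity: Rare", "Skull Hide", "Sun Leather", "--------", "Corrupted"].join("\n");
console.log(JSON.stringify(buildDocument(sample), null, 2));
```

Splitting into sections first keeps every later detection step (properties, sockets, mods) working on small arrays of lines rather than on the raw string.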

I will then need to add another system that takes the JSON document and converts it into a query document.

This system is currently being developed under related/text-to-json as a node package.

pete-wn commented 8 years ago

First pass tonight includes the basics of:

express server to serve a basic form page and accept POST data from this page (the post should include the clipboard data directly from game)
frisby+jasmine unit tests

Features available as of now:

identify item rarity and item names, throw errors if unable to
identify Prophecy items
identify Divination Cards
identify Maps, extract map properties
rudimentary initial POC explicit mod extraction
unit testing for a couple prophecies, a couple div cards, and a normal map

Examples:

results for a grotto map:

divination card:

rare map with mods, initial extraction:

pete-wn commented 8 years ago

I've added support for detecting Vaal Fragments and Map Fragments in the latest commit.

pete-wn commented 8 years ago

Currency support is now added:

pete-wn commented 8 years ago

Added Gem type detection
Gem detection includes tagging of Quality and Level based properties only
Moved properties detection into a function
Added global Corrupted detection
Added tests for a few different level / quality gems, including one at (MAX)
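
As a rough illustration of the Level/Quality tagging and the global Corrupted check, here is a sketch. It assumes gem property lines look like `Level: 20 (Max)` and `Quality: +20% (augmented)`, and that a corrupted item carries a line reading exactly `Corrupted`. The function and field names are hypothetical and not taken from the repo.

```javascript
// Parse "Level" and "Quality" lines from a gem's properties section.
// Field names (level, levelIsMax, quality) are illustrative only.
function parseGemProperties(lines) {
  const props = {};
  for (const line of lines) {
    const level = /^Level:\s*(\d+)(\s*\(max\))?/i.exec(line);
    if (level) {
      props.level = Number(level[1]);
      props.levelIsMax = Boolean(level[2]);
      continue;
    }
    const quality = /^Quality:\s*\+?(\d+)%/i.exec(line);
    if (quality) {
      props.quality = Number(quality[1]);
    }
  }
  return props;
}

// Assumption: corruption shows up as a standalone "Corrupted" line.
function isCorrupted(allLines) {
  return allLines.some((line) => line.trim() === "Corrupted");
}

// Invented sample lines for illustration.
const gemLines = ["Level: 20 (Max)", "Quality: +20% (augmented)"];
console.log(parseGemProperties(gemLines), isCorrupted(["Corrupted"]));
```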

pete-wn commented 8 years ago

Added baseItemType detection based on item name for rares and uniques
Added initial properties parsing, including extraction for Armour and Weapons
Added initial physical and elemental damage properties extraction for Weapons

pete-wn commented 8 years ago

Added generalized socket information detection:
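
For readers without the example output, a sketch of what socket detection could look like. It assumes a clipboard line such as `Sockets: R-G-B B`, where `-` joins linked sockets and a space separates link groups. The function name and output fields are hypothetical, not the Indexer's format.

```javascript
// Parse a "Sockets:" line into counts and link groups.
// Assumption: "-" links sockets, whitespace separates link groups.
function parseSockets(line) {
  const match = /^Sockets:\s*(.+)$/.exec(line.trim());
  if (!match) return null;
  const groups = match[1].trim().split(/\s+/).map((g) => g.split("-"));
  const allSockets = groups.flat();
  const colours = {};
  for (const socket of allSockets) {
    colours[socket] = (colours[socket] || 0) + 1;
  }
  return {
    socketCount: allSockets.length,
    largestLinkGroup: Math.max(...groups.map((g) => g.length)),
    groups: groups.map((g) => g.join("-")),
    colours,
  };
}

console.log(parseSockets("Sockets: R-G-B B"));
```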

pete-wn commented 8 years ago

Added more detailed implicit and explicit mod parsing:

Also:

remove notes from infoArray, don't need 'em
remove flavor text from unique items

pete-wn commented 8 years ago

Added modsTotal support for numeric mods:
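
One way to picture a total for numeric mods is sketched below: replace the number in each mod line with `#` to get a key, then sum the values that share a key. This is an assumption about the general idea, not a description of how the Indexer or this service actually computes `modsTotal`. The function name and the sample mod lines are invented.

```javascript
// Sum numeric mods that share the same text once the number is masked.
// Only mods with exactly one number are totalled in this sketch;
// range mods such as "Adds 5 to 10 Fire Damage" are skipped.
// Any leading sign stays in the key (for example "+#% to ...").
function totalNumericMods(modLines) {
  const totals = {};
  const numberPattern = /\d+(?:\.\d+)?/g;
  for (const line of modLines) {
    const numbers = line.match(numberPattern);
    if (!numbers || numbers.length !== 1) continue;
    const key = line.replace(/\d+(?:\.\d+)?/, "#");
    totals[key] = (totals[key] || 0) + Number(numbers[0]);
  }
  return totals;
}

// Invented mod lines for illustration.
console.log(
  totalNumericMods([
    "+12% to Fire Resistance",
    "+30% to Fire Resistance",
    "Adds 5 to 10 Fire Damage",
    "+25 to maximum Life",
  ])
);
```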

pete-wn commented 8 years ago

Added a spec test for Ungil's Gauche which verifies:

ele damage
phys damage
critical strike / other weapon properties
implicit/explicit mods
sockets
modsTotals

```
...
Frisby Test: [IDENTIFY] Ungil's Gauche Boot Knife - 12 ms

        [ POST http://localhost:9000 ] - 12 ms

Finished in 0.278 seconds
15 tests, 118 assertions, 0 failures, 0 skipped
```

pete-wn commented 8 years ago

Okay, so, let me think this through. Remaining things to complete on the clipboard conversion to pwx JSON include:

Normal and Magic Items
Change various Unique/Rare sections (such as properties, mod detection, etc.) into functions so they can be called for detection in any rarity type
Flasks, at least at the properties/mods levels, are formatted differently and this needs to be looked into.
modsPseudo at some point might need to be populated. This requires conversion of the perl code into javascript and I'm not 100% sure if it's needed for this service.
Remove debug code
Better console logging

Once those are complete, we'll need to look at taking the document and turning it into appropriate elasticsearch queries.

JonKrone commented 8 years ago

Hey @trackpete!

I was really impressed with all the POE code you've got out there and wanted to dive in so I'm currently working on decomposing much of the unique and rare sections to later implement normal and magic item parsing. I should have asked but I hope your last comment is roughly the current stage of work on this issue.

I notice you've got no gitignore or other files I associate with standard repo flows so how would you prefer contributions? I can submit a pull request with the branch work for feedback later tonight.

pete-wn commented 8 years ago

Hey Jon! The latest code is indeed checked into the repo - I randomly get busy and don't have time to sit down and bang on code for weeks at a time. That last comment also stands for where I was looking to go with the project, and I would absolutely appreciate and accept contributions! A big reason why I started moving to node/javascript was to make the code more accessible since so few people use perl anymore.

My repo setup is a bit clumsy because 99.9% of the time I'm the only person working on it, but pull requests should work! Thanks a ton for checking into this. :)

JonKrone commented 8 years ago

Sweet! The switch definitely helped drive me to contribute. The perl files are fun to read as I can understand generally what's going on but I was really glad when I saw you using JS on this subproject because it is something I can immediately help with.

I put up PR #155 addressing part 2 from above and will move on to normal/magic items next.

pete-wn commented 8 years ago

Adding some information from reviewing your commits:

Normal items are a little weird because you just have to assume they only have implicit mods. Not a huge deal, but it does require thinking about.
Magic items are tougher for similar reasons - take a magic axe with only one mod, for example. How do you know if the mod is implicit or explicit? There's no indicator in the clipboard text.
The main problem with Normal and Magic items is base type detection, however. Again, easier with Normal items - the base type is line 2 of section 1:

Rarity: Normal
Karui Chopper

With rare and unique items, the base type is line 3 of section 1:

Rarity: Rare
Skull Hide
Sun Leather

Rarity: Unique
Maligaro's Virtuosity
Deerskin Gloves
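
The line positions described so far could be read out of section 1 roughly like this. It is a sketch only: `parseHeader` and its return fields are hypothetical names, and any rarity other than Normal, Rare, or Unique is returned with `baseItemType` left as `null`.

```javascript
// Extract rarity, name, and base type from the lines of section 1.
// Normal: base type is line 2. Rare/Unique: name is line 2, base type is line 3.
function parseHeader(headerLines) {
  const match = /^Rarity:\s*(\w+)$/.exec(headerLines[0] || "");
  if (!match) {
    throw new Error("Unable to identify item rarity");
  }
  const rarity = match[1];
  const rest = headerLines.slice(1);
  if (rest.length === 0) {
    throw new Error("Unable to identify item name");
  }
  if (rarity === "Normal") {
    return { rarity, name: rest[0], baseItemType: rest[0] };
  }
  if ((rarity === "Rare" || rarity === "Unique") && rest.length >= 2) {
    return { rarity, name: rest[0], baseItemType: rest[1] };
  }
  // Other rarities need separate handling; leave the base type unresolved.
  return { rarity, name: rest.join(" "), baseItemType: null };
}

console.log(parseHeader(["Rarity: Unique", "Maligaro's Virtuosity", "Deerskin Gloves"]));
```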

However, with Magic items, for some bizarre reason they don't do this. You end up with a single line name for the item which means you need a lookup table to extract the base types:

Rarity: Magic
Remora's Decimation Bow of Puncturing

Rarity: Magic
Incinerating Amethyst Ring

etc. The way this is handled in the main indexer is by doing a looping regexp comparison against a lookup table (such as the one in data/itemName-to-equipType.json) for word boundary matching, but from a performance perspective that works better as an array/etc... then again, performance is less important when you're just doing one item at a time so who cares if it takes .01s vs .005s.
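
A sketch of the looping, word-boundary comparison described above. The `knownBaseTypes` array is a hypothetical stand-in for the lookup data, since the real shape of data/itemName-to-equipType.json is not shown here, and `findBaseType` is an invented name.

```javascript
// Escape characters that have special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Loop over known base types and return the first one that appears
// in the magic item's name on word boundaries.
// Longer names are tried first so a multi-word base type wins over a
// shorter one contained within it.
function findBaseType(magicName, knownBaseTypes) {
  const candidates = [...knownBaseTypes].sort((a, b) => b.length - a.length);
  for (const baseType of candidates) {
    const pattern = new RegExp("\\b" + escapeRegExp(baseType) + "\\b");
    if (pattern.test(magicName)) return baseType;
  }
  return null;
}

// Hypothetical lookup data for illustration.
const knownBaseTypes = ["Decimation Bow", "Amethyst Ring", "Karui Chopper"];
console.log(findBaseType("Remora's Decimation Bow of Puncturing", knownBaseTypes));
console.log(findBaseType("Incinerating Amethyst Ring", knownBaseTypes));
```

Compiling one regular expression per base type on every call is wasteful, but as noted above it hardly matters when decoding a single item at a time.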

I do, however, need to make sure I have good reliable extraction of that JSON data.

Also, just wanted to point out, don't worry too much about any conventions I followed. The main goal is just to make sure the output can be directly compared to the indexer formatted data, but I'm still new enough to javascript that I'm sure there are better ways to do everything, so I appreciate any suggestions. I'm going to go ahead and merge this code and see if I can play around with it a bit today, but I'm also working on another project that will get most of my free time for a few days.

JonKrone commented 8 years ago

Thanks for the illustration! I did not see your comment until last week when I reviewed this thread.

The naming irregularities for Magic items vs the simplicity of Normal and Rare+ items are a little troublesome but, as you described, an iterative check through Magic itemType matches consistently finds a match. I am, however, hesitant because testing is relatively light. It is not a constant speed solution but the difference is little in the clipboard decoding use-case.

A shortcut I noticed, and let me know if this is incorrect, is that suffixes always begin with an 'of'. I used this to disregard all suffixes and begin an iterative search for matches at what I presumed to be the end of the real item type.
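
Roughly, that shortcut looks like the sketch below. It relies on the unverified assumption stated above that every suffix starts with "of" and that no base type contains " of " itself. `baseTypeFromMagicName` and the lookup list are hypothetical and not the code in the PR.

```javascript
// Drop everything from the first " of " onward, then test progressively
// shorter tails of the remaining words against the known base types.
function baseTypeFromMagicName(name, knownBaseTypes) {
  const known = new Set(knownBaseTypes);
  const suffixStart = name.indexOf(" of ");
  const withoutSuffix = suffixStart === -1 ? name : name.slice(0, suffixStart);
  const words = withoutSuffix.trim().split(/\s+/);
  // Longest tail first, so multi-word base types are preferred.
  for (let start = 0; start < words.length; start++) {
    const candidate = words.slice(start).join(" ");
    if (known.has(candidate)) return candidate;
  }
  return null;
}

// Hypothetical lookup data for illustration.
const baseTypeList = ["Decimation Bow", "Amethyst Ring"];
console.log(baseTypeFromMagicName("Remora's Decimation Bow of Puncturing", baseTypeList));
```

With a `Set`, each candidate check is a constant-time lookup, so the cost scales with the number of words in the name rather than the size of the lookup table.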

I have not been able to figure out how to detect an implicit vs explicit mod on singly-modded Magic items! It has been bugging me but I can not tell if there is a method for detecting it at all. Have you had this problem before or thought of a possible technique?

submitted PR #158

p.s. I apologize for disappearing for almost an entire month. This work was very simple and possibly held you up, I should have completed it before disengaging.

JonKrone commented 8 years ago

Ah, I think I was wrong about magic item mods. When an item is Magic but has only one mod, it must necessarily be an explicit mod because the fewest mods a magic item with an implicit can have is two. Any fewer and it would be Normal. Our current processing is able to handle this because Magic items with one mod are treated as explicit and when more than one mod is present, the clipboard text allows us to distinguish which are which.
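
That rule can be sketched as follows, assuming the clipboard puts implicit and explicit mods in separate sections when both are present. The function name, the input shape (an array of mod sections, each an array of lines), and the sample mod text are all invented for illustration.

```javascript
// Classify mod sections as implicit or explicit.
// Normal items: every mod is assumed implicit.
// Other rarities: a single mod section is explicit; with more than one
// section, the first is treated as implicit and the rest as explicit.
function classifyMods(rarity, modSections) {
  if (modSections.length === 0) {
    return { implicit: [], explicit: [] };
  }
  if (rarity === "Normal") {
    return { implicit: modSections.flat(), explicit: [] };
  }
  if (modSections.length === 1) {
    return { implicit: [], explicit: modSections[0] };
  }
  return { implicit: modSections[0], explicit: modSections.slice(1).flat() };
}

// Invented mod lines for illustration.
console.log(classifyMods("Magic", [["+20% to Fire Resistance"]]));
console.log(classifyMods("Magic", [["+10% to Chaos Resistance"], ["+20% to Fire Resistance"]]));
```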

When you have the time, could you describe the results from parsing a Flask? In particular:

Do we maintain any properties of Flasks?
Do we consider the effects of special flasks to be implicit mods?

pete-wn / exiletools-indexer

Clipboard decoding service in javascript #147