veiset / poe-vendor-string

Path of Exile Vendor Search tool

Feature suggestion regex for open prefix/suffix on any item #60

Open GreLeBr opened 1 year ago

GreLeBr commented 1 year ago

First of all, thank you for making this website available. Seeing the gwennen regex website, I had thought how cool it would be to make one for anything in PoE, but I would never have been able to make such a nice tool.

One of the reasons I wanted to make it is to roll heist items.

Open prefix is ^\s*Name_of_item
Open suffix is $\s*Name_of_item
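To make the open prefix/suffix idea concrete, here is a minimal Kotlin sketch. It assumes that a magic item's name line reads "prefix, base type, suffix", so an open prefix means nothing precedes the base type and an open suffix means nothing follows it. The function names and the "Example Cloak" base type are hypothetical, and the suffix pattern anchors the base type at the end of the line, which is my reading of the intent rather than something the tool does today.

```kotlin
// Hypothetical helpers; none of these names come from poe-vendor-string.

// Open prefix: nothing but whitespace precedes the base type on the name line.
fun openPrefixRegex(baseType: String): Regex =
    Regex("^\\s*" + Regex.escape(baseType))

// Open suffix: nothing but whitespace follows the base type on the name line.
fun openSuffixRegex(baseType: String): Regex =
    Regex(Regex.escape(baseType) + "\\s*\$")

fun hasOpenPrefix(nameLine: String, baseType: String): Boolean =
    openPrefixRegex(baseType).containsMatchIn(nameLine)

fun hasOpenSuffix(nameLine: String, baseType: String): Boolean =
    openSuffixRegex(baseType).containsMatchIn(nameLine)
```

With these, "Example Cloak of Choreography" would count as having an open prefix, and "Skillful Example Cloak" as having an open suffix.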

Every league I manually roll Choreography on cloaks and +1 to all jobs on each tool. The regexes are not long to write, but it is annoying. Bonus points since your website already calculates how to squeeze in as many regexes as possible for certain mods with the character limit in mind.
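As a rough illustration of squeezing terms into a character limit, the sketch below greedily joins the shortest search terms with "|" until the budget is used up. The helper name and the `maxLength` parameter are hypothetical. The limit is passed in by the caller rather than taken from the game, and this is not necessarily how the website does it.

```kotlin
// Hypothetical helper: joins search terms with "|" while staying within a character budget.
fun packTerms(terms: List<String>, maxLength: Int): String {
    val picked = mutableListOf<String>()
    var used = 0
    // Shortest terms first, so that as many terms as possible fit.
    for (term in terms.sortedBy { it.length }) {
        // Every term after the first also costs one character for the "|" separator.
        val cost = if (picked.isEmpty()) term.length else term.length + 1
        if (used + cost > maxLength) break // all remaining terms are at least as long
        picked.add(term)
        used += cost
    }
    return picked.joinToString("|")
}
```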

Sorry, I see someone made the same request for flasks months ago, so maybe you are not keen on doing it.

veiset commented 1 year ago

Thanks for the suggestion!

Ah, the request for flasks is something I want to do, just not done due to time constraints.

Is the ability to roll cloaks with specific needs for heist what you need? Is the open prefix/suffix part important, or is what is needed a way to roll heist gear in general? (like hitting that +1 all jobs).

GreLeBr commented 1 year ago

So the idea on the cloaks is to roll: of Choreography | 83 | 80 | (4–5)% chance to not Activate Lockdown in Grand Heists. Since it is a suffix with a weight of 80, I usually highlight items with an open suffix so I can augment them, which reduces the number of alts I need.

Similarly, all the heist tools have: Skillful | 82 | 100 | +1 to Level of all Jobs for Heists. That one is a prefix, so I highlight items with an open prefix.

But I am usually too lazy to write out the best possible regexes for the associated prefix/suffix mods, so it would be great if it were possible to choose which ones to add.
I have not looked at your code yet, so I don't know whether you wrote them manually or added all the mod/item texts and coded a way to find the shortest regex depending on the other words requested. If it is done manually, that is obviously a lot of work.

veiset commented 1 year ago

Nice. I think I better understand the use-case. This would fit well under the heist page, and I will work on it when I have time. I am prioritizing improving the flask page first, but I'll get to this feature.

I have a regex generator, so a lot of it is automated, luckily :D

GreLeBr commented 1 year ago

Out of curiosity, can I ask which regex generator you are using? One thing I was curious to check in the gwennen regex generator, and now yours (but I never did), is what kind of algorithm people use to find, for each string in a list, the shortest possible regex that does not match any of the other strings.
I imagine it must be some kind of classical algorithm question to be optimized.
I would imagine that maybe you start at a fixed size of 2 or 3 and go up a letter when it fails, but maybe there is something a lot more clever that people figured out long ago.
In your code I see the regexes are hardcoded, unless I missed the file where the method is shown.
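The "start at a fixed size and go up a letter" idea could be sketched roughly as follows in Kotlin. This is only an illustration of the guess described above, not the generator used by the site. The function name and the `startLength` parameter are hypothetical.

```kotlin
// Hypothetical sketch: try all substrings of a given length, and grow the length by one on failure.
fun shortestUniqueByLength(target: String, others: List<String>, startLength: Int = 2): String? {
    for (length in startLength..target.length) {
        for (start in 0..target.length - length) {
            val candidate = target.substring(start, start + length)
            // A candidate is usable only if no other string contains it.
            if (others.none { it.contains(candidate) }) return candidate
        }
    }
    return null // every candidate substring also occurs in another string
}
```

For example, with the target "skillful" and the other strings "skilled" and "helpful", the two-letter candidates "sk", "ki", "il" and "ll" all occur in "skilled", so the first unique candidate is "lf".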

veiset commented 1 year ago

It's not very optimized in my case, but it runs fast enough (around 1s for around 2000-5000 inputs). It needs to be pre-processed, though, so that's why I generate all the regex files upfront.

The code (written in Kotlin) is the core part of the generator:

fun shortestUnique(query: String, modifiers: List<String>): String {
    val cleanQuery = query.cleanInput()
    val mods = modifiers.filter { it != query }.joinToString(" ")
        .cleanInput()

    val resultWithSpace = subStrings(cleanQuery).firstOrNull {
        !mods.contains(it)
                && !it.contains(Regex("[|\\n\\\\+]"))
                && !it.endsWith(" ")
    }
    val resultWithoutSpace = subStrings(cleanQuery).firstOrNull {
        !mods.contains(it)
                && !it.contains(" ")
                && !it.contains("-")
                && !it.contains("|")
                && !it.contains("\\")
                && !it.contains("+")
                && !it.endsWith(" ")
    }
    val spaceLength = resultWithSpace?.length ?: 1000
    val nospaceLength = resultWithoutSpace?.length ?: 1000
    val regexResult = if (nospaceLength <= spaceLength + 2) resultWithoutSpace else "\"$resultWithSpace\""
    val finalResult = regexResult ?: query
    val finalWithoutNumbers = finalResult.replace("#", "\\\\d+")
    return if (finalWithoutNumbers.contains(" ") && !finalWithoutNumbers.startsWith("\"")) {
        "\"$finalWithoutNumbers\""
    } else {
        finalWithoutNumbers
    }
}

fun subStrings(str: String, n: Int = str.length): List<String> {
    val substrings = mutableListOf<String>()
    for (i in 0 until n) for (j in i + 1..n) {
        substrings.add(str.substring(i, j))
    }
    substrings.sortBy { it.length }
    return substrings
}

If performance had been an issue, I could have made the function subStrings lazy and just taken candidates until I found a match. For now I generate all the possible strings, sort them, and take the first one that doesn't match anything else! 👍
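A lazy variant along those lines might look like the sketch below, which uses Kotlin's `sequence` builder to yield substrings shortest-first, so nothing needs to be collected or sorted up front. The names `lazySubStrings` and `firstUnique` are hypothetical, and the extra character filters from shortestUnique are left out to keep the example short.

```kotlin
// Yields substrings shortest-first; generation stops as soon as the consumer stops asking.
fun lazySubStrings(str: String): Sequence<String> = sequence {
    for (length in 1..str.length) {
        for (start in 0..str.length - length) {
            yield(str.substring(start, start + length))
        }
    }
}

// Hypothetical consumer: the first substring of the query that does not occur in the other mods.
fun firstUnique(query: String, mods: String): String? =
    lazySubStrings(query).firstOrNull { !mods.contains(it) }
```

Because the eager version sorts by length with a stable sort, both versions should visit candidates in the same order. The lazy one simply stops at the first hit.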

veiset commented 1 year ago

Just to clarify why this is generated upfront and not done on the client:

The generation will take a long time regardless of how well the shortest-unique part of the algorithm runs. The thing that takes the most time is fetching all the input that the shortest string algorithm needs. For gwennen you need a list of all the unique items in the game, as well as real-time data from poe.ninja. This would take a lot of time to do when you enter the page, so all of it is preprocessed, generated and cached before the client enters the page.

GreLeBr commented 1 year ago

Thanks for the code. I don't know Kotlin, but I can more or less guess what is happening. And I totally understand that you precalculate the regexes to avoid unnecessarily long load times. I guess I thought it might be possible to squeeze in one or two more items if the regex were generated from what the user wants to filter (I think the gwennen regex was doing that?).

Otherwise, I have not seen the flask page, but yeah, it will be the same idea, apart from adding a way to lock an open prefix or suffix.