Closed udnp closed 6 years ago
Here performance improvement means decrease duration of detecting pokemons.
My previous PR #968 has the issue that increases duration of detecting pokemons. So this PR is to improve it.
This is a demo comparing this PR's improvement logic with v5.1.0(prev PR #968), and this PR logic decreases about 35% of duration for detecting pokemons.
This demo is the case of GoIV_JP(my private GoIV build for Japanese), but improvement logic is the same as GoIV origin.
Especially, this performance issue had been caused when calling PokemonNameCorrector#getNicknameGuess()
,
in particular, for the cases #2, #4, #5 and #7 in PokemonNameCorrector#getPossiblePokemon()
.
//1. Check if nickname perfectly matches a pokemon (which means pokemon is probably not renamed)
//2. See if we can get a perfect match with candy name & upgrade cost
//3. check correction for Eevee’s Evolution using it's Pokemon Type
//3.1 Azuril and marill have the same evolution cost, but different types.
//4. maybe the candy upgrade cost was scanned wrong because the candy icon was interpreted as a number (for
// example the black candy is not cleaned by the ocr). Try checking if any in the possible evolutions that
// match without the first character.
//5. get the pokemon with the closest name within the evolution line guessed from the candy (or candy and
// cost calculation).
//6. Check if the found pokemon should be alolan variant or not.
//7. All else failed: make a wild guess based only on closest name match
In PR #968 java.text.Normalizer.normalize()
and other operations for charactors had been called in the for-loop of PokemonNameCorrector#getNicknameGuess()
through StringUtils.normalize()
.
This had caused increase of detecting pokemon duration.
Especially, for-loop for the candy pokemons collection is much affected because this collection size is so long.
Additionally, I've found that PokemonNameCorrector#getNicknameGuess()
continued the for-loop even if the best match pokemon was found.
Because Data.levenshteinDistance()
returns 0
for the best match pokemon,
it should stop and break the for-loop if it finds the best match pokemon like ★ commented line.
if (dist < lowestDist) {
bestMatchPokemon = trypoke;
lowestDist = dist;
if (dist == 0) break; // ★
}
About refactoring.
In PokeInfoCalculator
, the methods to return either normalized text or not normalized text had been mixed.
So as its role, this PR makes PokeInfoCalculator
provide raw(not normalized) poekmon text,
and pokemon text is normalized by the side of calling PokeInfoCalculator
, like PokemonNameCorrector
and InputFraction
.
And this PR makes PokemonNameCorrector
singleton.
Lastly, this PR makes SanData
manage scanned text and provide normalized scanned text as its role.
As a result, OcrHelper
class gets simple a little. 😅
this PR improves performance at the points of followings:
refactoring points: