nkh / P5-App-Asciio

Plain ASCII diagram
https://nkh.github.io/P5-App-Asciio/
53 stars 4 forks source link

Refactoring: transform_elements_to_ascii_array_for_cross_overlay #104

Closed nkh closed 1 year ago

nkh commented 1 year ago

needs speed up

check all functions in Ascii.pm

qindapao commented 1 year ago

@nkh

The one in Ascii.pm is ok, and it is not very sensitive to speed.

The efficiency of this in cross mode is really a problem.

nkh commented 1 year ago

@qindapao

I have no problem with speed in the cross mode, either I don't have a large enough diagram ( please send me the largest you have) in crossing mode or it's windows related.

In any case we want it to be fast in windows so I spend a few hours optimizing your code

recomputing the ascii array everytime

There's little we can do about that, Asciio could be more "intelligent" and track changes but that would be a large change that will have to wait (a long time); I will time the code when I have time, it's simple so it should run fast, all the functions are coded in C below the surface.

Still the ascii array computation was sub optimal because of the way the code was ordered and because we keep a list of superimposed characters forcing us to deal with arrays. There's no need for that, I changed it to track only the top character. I may have missed something but it worked fine in my tests, should that be a problem we can always put back the old code.

More on that below.

calling unicode_length for every character

That's a few expensive regexp, I cached it so it the regexp is only called once per unique character

expensive creation of cache key

Because I changed the returned ascii array to point to single characters I could optimize the key computation

calling a long series of normal_char_func

You cache the result so it's a one time cost but it' still a lot of functions

each normal_char_func scans multiple times multiple arrays

for example this one:

sub scene_cross
{
my ($up, $down, $left, $right, $index) = @_;

return 0 unless defined $up && defined $down && defined $left && defined $right ;

return ((any {$_ eq '|'} @{$up}) || (any {$_ eq '.'} @{$up}) || (any {$_ eq '\''} @{$up}) || (any {$_ eq '+'} @{$up}) || (any {$_ eq '^'} @{$up}))
    && ((any {$_ eq '|'} @{$down}) || (any {$_ eq '.'} @{$down}) || (any {$_ eq '\''} @{$down}) || (any {$_ eq '+'} @{$down}) || (any {$_ eq 'v'} @{$down}))
    && ((any {$_ eq '-'} @{$left}) || (any {$_ eq '.'} @{$left}) || (any {$_ eq '\''} @{$left}) || (any {$_ eq '+'} @{$left}) || (any {$_ eq '<'} @{$left}))
    && ((any {$_ eq '-'} @{$right}) || (any {$_ eq '.'} @{$right}) || (any {$_ eq '\''} @{$right}) || (any {$_ eq '+'} @{$right}) || (any {$_ eq '>'} @{$right})) ;

}

I didn't change any of those function but because I reduced the number of characters in the each cell, I have effectively optimized them for the case where many characters overlap.

Still the function can be rewritten for a single character instead (less code and also more effective for the interpreter)

I've left a replacement example in the code as a comment. https://github.com/nkh/P5-App-Asciio/blob/62cc570d8a1d96de17fc04ab05b9d733d35dfb6b/lib/App/Asciio/Cross.pm#L256-L283

All the functions have to be changed as they are all called from the same place and their input has changed; if the changes I made are not enough, you can try that.

Please leave the CROSS_MODE at 0 in the object and change it in your config only.

qindapao commented 1 year ago

@nkh

Thank you for your optimization, but there is a problem, Recording only the topmost characters is fine in general, but there is a special case. I mentioned it before.I also wrote about this special case in the cross algorithm.

In my original implementation, I did only record the topmost characters, so I didn't use an array. It was later changed to use an array to record the characters of each point. It is true that the performance has decreased due to the use of arrays, but I have not found a good way.

cross_error

You can see the code you modified, where the place where the plus sign should have been is now a dot, When there are also intersections next to the intersection, it is necessary to consider the direction of the foreground character and the background character at the same time.

If it is an cross point, there will be two elements recorded in the array, including the foreground character and the background character. If it is a normal point, there will only be one character in the array, which is the top character.

At the beginning, I didn't use the any function. It was because of this special situation that I used the any function.

Caching of unicode_length is great, I noticed it here too.As long as it does not take up too much memory, there is no problem, otherwise we may need to consider the release of the cache later. But the string cache is probably not too large, right?

nkh commented 1 year ago

@qindapao hi,

I don't think the cache size will matter.

How many background characters need to be considered? one or more?

qindapao commented 1 year ago

@nkh

The character below the top-level character.There is either one character or two in the array.

nkh commented 1 year ago

Do you need the background element at position X or at all the positions?

screenshot_2023-07-16_16-36-55

nkh commented 1 year ago

screenshot_2023-07-16_16-39-41

I forgot we can make those look good now ;)

qindapao commented 1 year ago

@nkh

I need a background character at the cross point. I also said in the algorithm document the definition of cross point, and coverage should be excluded.X here is not an cross point, so it is not needed.

nkh commented 1 year ago

I'll read the doc again and think about it.

As it is now, do you feel a speed difference?

qindapao commented 1 year ago

@nkh

It seems a little fast, but I haven't done enough testing. Unicode_length caching is definitely effective.

qindapao commented 1 year ago

@nkh

The other is the calculation of hash key. array splitting, character splicing, calculation is too complicated.

nkh commented 1 year ago

@qindapao

current code:

The only thing left is the array generation:

the array generation should be a fast even with splitting, I see no splicing, those are basic operations and are fast.

It's probably possible to cache the computation.

I'll put back the arrays (we can always think about those later) and time things to see where time is spent.

nkh commented 1 year ago

@qindapao there are one or two other issues you can look at and I will take care of more optimization here, if possible.

qindapao commented 1 year ago

@nkh

Well, thank you for your efforts.

I'll take time to look at other issues that I need to deal with.

qindapao commented 1 year ago

@nkh

I have already done the cache calculation results.Isn't that enough?

The results of character combinations in each direction are cached. There is only one expense.

nkh commented 1 year ago

@qindapao I'm looking into the array generation not the crossing computation.

qindapao commented 1 year ago

@nkh

markup substitution There is almost no impact after closing, so it should not be considered.

qindapao commented 1 year ago

@nkh

If it is intelligently generated, it may mean a big change.

nkh commented 1 year ago

I don't think it's the problem, I had a box with 7 lines for a total of 14 crossings, The crossing code, including array building take 0.01 second per screen update. I'm sure we can do better but it seems to me that the problem you have is somewhere else, it's very smooth on my linux machine.

Maybe you can have a run in the profiler too.

cross_profiling.zip

qindapao commented 1 year ago

@nkh

https://github.com/nkh/P5-App-Asciio/commit/7ae72ec51a1b9fd8d62bda3d2ff702e7a51dabc7

Regarding the performance issue, it is enough for me at present. I don't struggle anymore. Can we end this discussion?

I also don't have a particularly large graph, and I think the current optimization is enough. If you agree, let's end this discussion, too much time has been used.

Also there are two minor issues with your fallback code, which I have fixed. I have some features in my previous code, maybe you didn’t pay attention to it when you rolled back, and you lost it.

Mainly to fix these two small problems:

cross_error_1

  1. Pay attention to this line, if a point is removed from the array because the following characters do not meet the requirements, then the cross point coordinate array also needs to be removed, which is why I used a hash before. However, before returning, delete the points that do not have two elements in the array.

cross_error_2

  1. If the current foreground characters already meet the requirements, then there is no need to generate cover characters.
nkh commented 1 year ago

@qindapao I see that you've noticed that I reset the array computation so it keeps all characters. I'm happy with the crossing mode for now. Let's shelf it for a while.

When I have access to a windows machine I will test to see where it is slower.

We can close this issue and concentrate on the few other refactoring that are left.

merged in 3df8674a5d6ff29acb7f8e663a2d58c03a8e0639

qindapao commented 1 year ago

@nkh

okay, thank you. Let's focus on solving other refactorings.