boris-kz / CogAlg

This project is a Computer Vision implementation of the general hierarchical pattern discovery principles introduced in the README:
http://www.cognitivealgorithm.info
MIT License

Numpy "fancy indexing", conditional indices, iterators #8

Closed Twenkid closed 3 years ago

Twenkid commented 5 years ago

BTW, do you know about the numpy "fancy indexing" and conditional indices? They are supposed to be more efficient than normal iteration and shorter, too.

p[p > 0] = p[p > 0] * 1.6  # or, in-place: p[p > 0] *= 1.6

https://www.numpy.org/devdocs/reference/generated/numpy.where.html#numpy.where

numpy.where(condition[, x, y]) Return elements chosen from x or y depending on condition.

"If all the arrays are 1-D, where is equivalent to:

[xv if c else yv for c, xv, yv in zip(condition, x, y)]"

```
>>> x, y = np.ogrid[:3, :4]
>>> np.where(x < y, x, 10 + y)  # both x and 10+y are broadcast
array([[10,  0,  0,  0],
       [10, 11,  1,  1],
       [10, 11, 12,  2]])
```

...

putmask(a, mask, values) Changes elements of an array based on conditional and input values.

Examples

```
>>> x = np.arange(6).reshape(2, 3)
>>> np.putmask(x, x > 2, x**2)
>>> x
array([[ 0,  1,  2],
       [ 9, 16, 25]])
```

If values is smaller than a, it is repeated:

```
>>> x = np.arange(5)
>>> np.putmask(x, x > 1, [-33, -44])
>>> x
array([  0,   1, -33, -44, -33])
```

...

A more complex one which sends functions as parameters:

class numpy.nditer: Efficient multi-dimensional iterator object to iterate over arrays.

https://www.numpy.org/devdocs/reference/generated/numpy.nditer.html#numpy.nditer

...

https://www.numpy.org/devdocs/reference/routines.indexing.html

Of course, for more meaningful transformations, some kind of nesting of such "fancy" operations or lambda functions, and perhaps a different data layout, would be required.

E.g. a more parallel process of pattern completion in pre-allocated slots in numpy arrays, which are then wrapped up in batch, rather than sequentially, one by one, allocating new variables for each call. Where that's algorithmically possible; it may require intermediate steps and transformations.
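For illustration, a minimal sketch of that batch-mode idea (the values, threshold and layout here are made up, not the project's actual pattern logic): the same conditional update written per-element and as one vectorized np.where step:

```
import numpy as np

p = np.random.randint(0, 255, size=(240, 320)).astype(int)

# sequential, per-element style:
out_loop = p.copy()
for i in range(p.shape[0]):
    for j in range(p.shape[1]):
        if p[i, j] > 127:
            out_loop[i, j] = p[i, j] - 127

# the same transformation in batch mode, no Python-level loop:
out_vec = np.where(p > 127, p - 127, p)
assert (out_loop == out_vec).all()
```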

https://github.com/boris-kz/CogAlg/blob/7ab8bf6e0d293e5695805c5def39d8998ecbfb0c/frame_draft.py#L482

boris-kz commented 5 years ago

On Sun, Sep 2, 2018 at 3:58 AM Todor Arnaudov notifications@github.com wrote:

BTW, do you know about the numpy "fancy indexing" and conditional indices? They are supposed to be more efficient than normal iteration and shorter, too...

Maybe, but I don't think it makes much difference in speed; at this point, consistent notation and easy debugging are much more important. Long term, I may switch to Julia: https://docs.julialang.org/en/v1/, but I don't think it has good debugging tools yet.

For example, there is a bug in frame_dblobs, scan_P_ function: the conditions on lines 148 and 155 are always false. Basically, "_root_" and "_fork_" length is always > 0, even though they are renamed "root_" and "fork_", which do have length == 0 instances. So, they probably get appended somewhere during transfers, for no good reason. I figured the way to track individual arrays would be to put a breakpoint on their object ID, but I don't know how to do it.

But this is not a top priority either, I am still redefining core comparison by division. It seems to have at least three incrementally refining steps:

1: max / min -> ratio (div miss), new clean match = min / ratio: this is intuitively more accurate than basic clean match = min - |diff|.

2: max - ratio -> div match, clean match = div match / ratio: this represents additional magnitude compression but should also have much higher "ave" filter.

3: min multiple (integer part of ratio) -> minimally cleaned match, down-shifted (up-shifted fraction multiple) -> clean miss, (min * multiple) / clean miss -> refined clean match

Also, I am editing my comparison to CNN; their edge detection kernels are more like lateral rather than vertical comparison.

Twenkid commented 5 years ago

On Mon, Sep 3, 2018 at 4:12 AM Boris Kazachenko notifications@github.com wrote:

On Sun, Sep 2, 2018 at 3:58 AM Todor Arnaudov notifications@github.com wrote:

BTW, do you know about the numpy "fancy indexing" and conditional indices? They are supposed to be more efficient than normal iteration and shorter, too...

Maybe, but I don't think it makes much difference in speed, at this point consistent notation and easy debugging are much more important.

I agree on the consistent notation; as for the debugging, it's yet theoretical.

However, for practical purposes and real debugging the gain may be significant: say, a simulation running for an hour and using 10 GB RAM, instead of 4 or 8 hours and 60 GB (hardly runnable in RAM on a PC). We can't make a precise estimate now, but the improvement could be even bigger, with a vectorization transformation (of the existing code into something else) and if the IFs and the generation of instances can be done in batch mode.

Long term, I may switch to Julia: https://docs.julialang.org/en/v1/, but I don't think it has good debugging tools yet.

Interesting.

For example, there is a bug in frame_dblobs, scan_P_ function: the conditions on lines 148 and 155 are always false. Basically, "_root_" and "_fork_" length is always > 0, even though they are renamed "root_" and "fork_", which do have length == 0 instances. So, they probably get appended somewhere during transfers, for no good reason. I figured the way to track individual arrays would be to put a breakpoint on their object ID, but I don't know how to do it.

I'll take a look. I planned to give the existing 2D code a try and see what the patterns look like, but I haven't done it yet.

But this is not a top priority either, I am still redefining core comparison by division. It seems to have at least three incrementally refining steps:

1: max / min -> ratio (div miss), new clean match = min / ratio: this is intuitively more accurate than basic clean match = min - |diff|.

Because abs(diff) loses the sign?

2: max - ratio -> div match, clean match = div match / ratio: this represents additional magnitude compression

Isn't ratio supposed to be of a small magnitude on average? I.e. in most cases the difference of max and max-ratio would be "small" (200,100 --> 200,198 ?), except when there are huge differences like 255,1 but it's then going to 0?

but should also have much higher "ave" filter.

Higher magnitude or higher level?

3: min * multiple (integer part of ratio) -> minimally cleaned match, down-shifted (up-shifted fraction multiple) -> clean miss, (min * multiple) / clean miss -> refined clean match

By the way, since you're using min/max expressions: the demoscene/shader-art/procedural CGI often uses sequences of nested min and max, as well as their shorthand "clamp": https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/clamp.xhtml

clamp returns the value of x constrained to the range minVal to maxVal. The returned value is computed as min(max(x, minVal), maxVal).

A sequence of min(max(min(..., with proper inputs and formulas, allows substituting conditional operations with pure calculation: when the value is below or above the range, it's simply ignored until it becomes active for its range, etc.

IFs used to be far more expensive on the GPU (and on the CPU as well): the GPU had to compute all the branches, never mind the actual choice. GPUs are better now and the difference may be smaller; still, that approach is elegant and purely numerical.
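For illustration, a minimal NumPy sketch of that min(max(...)) pattern (the values are made up; np.clip is the library shorthand for the same computation):

```
import numpy as np

x = np.array([-5, 0, 3, 9, 255])
lo, hi = 0, 8

clamped = np.minimum(np.maximum(x, lo), hi)  # min(max(x, lo), hi): no per-element IFs
assert (clamped == np.clip(x, lo, hi)).all()
```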

I suspect that similar ideas may be useful, but for now I can't be specific.

Also, I am editing my comparison to CNN; their edge detection kernels are more like lateral rather than vertical comparison.

OK


boris-kz commented 5 years ago

I agree on the consistent notation; as for the debugging, it's yet theoretical.

Most of debugging is "theoretical", you have to understand what the algorithm is supposed to do.

However, for practical purposes and real debugging the gain may be significant: say, a simulation running for an hour and using 10 GB RAM, instead of 4 or 8 hours and 60 GB (hardly runnable in RAM on a PC).

This is not a problem near term. And then there is AWS, etc.

We can't make a precise estimate now, but the improvement could be even bigger, with a vectorization transformation (of the existing code into something else) and if the IFs and the generation of instances can be done in batch mode.

Julia is supposed to do automatic vectorization too, but it probably won't help my algorithm very much.

For example, there is a bug in frame_dblobs, scan_P_ function: the conditions on lines 148 and 155 are always false. Basically, "_root_" and "_fork_" length is always > 0, even though they are renamed "root_" and "fork_", which do have length == 0 instances. So, they probably get appended somewhere during transfers, for no good reason. I figured the way to track individual arrays would be to put a breakpoint on their object ID, but I don't know how to do it.

I'll take a look. I planned to give the existing 2D code a try and see what the patterns look like, but I haven't done it yet.

Because of this bug, frame_dblobs only initializes blobs, it doesn't increment or terminate them. So, half of it never actually runs. When debugging, run it line-by-line: breakpoint at line 257: for y in range(1, Y). It will give you an out-of-range error at y == 10; not sure why, but you don't need to go that far. In PyCharm's "Variables" tab, make sure to unfold all variables of "dP" and "vP": the results of processing the prior line.

But this is not a top priority either, I am still redefining core comparison by division. It seems to have at least three incrementally refining steps:

1: max / min -> ratio (div miss), new clean match = min / ratio: this is intuitively more accurate than basic clean match = min - |diff|.

Because abs(diff) loses the sign?

No, signed diff is preserved as a separate variable in both cases. But redundancy to dPs should be projected over future inputs, and projection rate depends on their relative value: diff / min.

This is easier to see in equivalent clean: min * (min / max (=min+diff)): match is reduced in proportion to relative difference, but stays >= 0.

On the other hand, min - |diff| can be negative, which only makes sense for a relative match. So, this is really a part of evaluation, along with - ave, while min / ratio is an actual evaluand.
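A worked example with assumed inputs 100 and 200 (illustrative values only, plugged into the two formulas above):

```
a, b = 100, 200
mn, mx = min(a, b), max(a, b)

print(mn - abs(a - b))  # 0: subtraction-based match bottoms out here (and can go negative)
print(mn * (mn / mx))   # 50.0: division-based match stays >= 0, reduced in
                        # proportion to the relative difference
```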

2: max - ratio -> div match, clean match = div match / ratio: this represents additional magnitude compression

Isn't ratio supposed to be of a small magnitude on average?

Yes, hence the compression. But it seems meaningless by itself, because a float is two numbers with two separate magnitudes. So, I may skip it and go directly to #3.

but should also have much higher "ave" filter.

Higher magnitude or higher level?

Magnitude, because division has a greater opportunity cost.

3: min * multiple (integer part of ratio) -> initial clean match, integer-represented fraction multiple -> remaining miss, (min * multiple) / remaining miss -> refined clean match

By the way, since you're using min/max expressions: the demoscene/shader-art/procedural CGI often uses sequences of nested min and max, as well as their shorthand "clamp": https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/clamp.xhtml

clamp returns the value of x constrained to the range minVal to maxVal. The returned value is computed as min(max(x, minVal), maxVal).

A sequence of min(max(min(..., with proper inputs and formulas, allows substituting conditional operations with pure calculation: when the value is below or above the range, it's simply ignored until it becomes active for its range, etc.

I don't see how you could form patterns with matrix computation, without IFs. There is no good way to do exclusive variable-span summation.

IFs used to be far more expensive on the GPU (and on the CPU as well): the GPU had to compute all the branches, never mind the actual choice. GPUs are better now and the difference may be smaller,

GPUs are mostly SIMD and stream-oriented, there isn't enough cache per ALU for localized logic. So, I probably won't use them, my parallelization would have to be more coarse.

still, that approach is elegant and purely numerical.

I call it indiscriminate, which is not smart.

Twenkid commented 5 years ago

A partial answer on code:

For example, there is a bug in frame_dblobs, scan_P_ function: the conditions on lines 148 and 155 are always false. Basically, "_root_" and "_fork_" length is always > 0, even though they are renamed "root_" and "fork_", which do have length == 0 instances. So, they probably get appended somewhere during transfers, for no good reason. I figured the way to track individual arrays would be to put a breakpoint on their object ID, but I don't know how to do it.

Are you sure? From my first examination, it seems that _x is always > ix, and the if len(_fork_) branch was never hit with the first image I fed in (not your raccoon). However, in another run with another image, the log included both branches.

```
if _x > ix:  # x overlap between _P and next P: _P is buffered for next scan_P_, else included in blob:
    print("if _x > ix:")
    buff_.append((_P, _x, _fork_, root_))
else:
    print("NOT if _x > ix:")
    if len(_fork_) == 1 and _fork_[0][0]

scan_P_
scan_P_
scan_P_
scan_P_
scan_P_
_fork_=_P_.popleft []
P[0] == _P[0]?
FORK_:  [[]]
_FORK_:  []
if _x > ix:
_fork_=_P_.popleft []
_FORK_:  []
if _x > ix:
scan_P_
3.0561718940734863
```

The image was 320x240, but the run is suspiciously short; it seems to depend on the input.

When I run it with another image, a big one, 1600x1200 and different (color, but black-and-white content... I'll do it more systematically later), the output was huge, and then it hit this IF: http://twenkid.com/cog/fork.txt

Also, I think the loading of the image is wrong. How did it run before? cv2 complained.

I think astype is not needed, and also it's better to convert it to single channel, because usually it won't be BW.

cv2.imshow("raw", image)
print(image.shape)
Y, X, Channels = image.shape  # image height and width
# or Y, X, _ = image.shape    # in case you don't care about the value of
channels
# as it was Y,X = ... cv2 complained

Then:

```
if Channels > 1:
    image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
```

Or, if you don't wish to use their conversion, you could split the channels and do other calculations, such as:

```
b, g, r = cv2.split(image)
```

As for astype, the input would be float only for some special images; normal ones are uint8:

path = r"D:\Capture\capture_106_15042018_025159.jpg"
image = cv2.imread(path)
image2 = image.copy()
print(image2.shape)
print(image2)
print(type(image))
print(type(image2))
print(image.dtype)
print(image2.dtype)
...

<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
uint8
uint8
(1200, 1600, 3)

Twenkid commented 5 years ago

A longer log with the second image:

http://twenkid.com/cog/fork.txt

boris-kz commented 5 years ago

On Thu, Sep 6, 2018 at 5:58 AM Todor Arnaudov notifications@github.com wrote:

A partial answer on code:

For example, there is a bug in frame_dblobs, scan_P_ function: the conditions on lines 148 and 155 are always false. Basically, "_root_" and "_fork_" length is always > 0, even though they are renamed "root_" and "fork_", which do have length == 0 instances. So, they probably get appended somewhere during transfers, for no good reason. I figured the way to track individual arrays would be to put a breakpoint on their object ID, but I don't know how to do it.

Are you sure? From my first examination, it seems that _x is always > ix, and the if len(_fork_) branch was never hit with the first image I fed in (not your raccoon). However, in another run with another image, the log included both branches.

Just stick with the raccoon for now; it works through line 148. Are you even using PyCharm?

```
if _x > ix:  # x overlap between _P and next P: _P is buffered for next scan_P_, else included in blob:
    print("if _x > ix:")
    buff_.append((_P, _x, _fork_, root_))
else:
    print("NOT if _x > ix:")
    if len(_fork_) == 1 and _fork_[0][0]
```


The image was 320x240, but the run is suspiciously short; it seems to depend on the input.

When I run it with another image, a big one, 1600x1200 and different (color, but black-and-white content... I'll do it more systematically later), the output was huge, and then it hit this IF: http://twenkid.com/cog/fork.txt

There is no sense in running the whole image at this point; run it line by line.

Also, I think the loading of the image is wrong. How did it run before? cv2 complained.

I think PyCharm is having some issues interfacing with cv2, but it doesn't matter, because the image still loads. You don't even need cv2; you can load the same image online, just uncomment lines 280-282.

I think astype is not needed, and also it's better to convert it to single channel, because usually it won't be BW.

Primary input will always be BW, I will use color / BW as sub-inputs.

cv2.imshow("raw", image) print(image.shape) Y, X, Channels = image.shape # image height and width

or Y, X, _ = image.shape # in case you don't care about the value of

channels

as it was Y,X = ... cv2 complained

But it still works, their values are correct.

Then: if Channels > 1: image = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)

How is it different from specifying 0 in: image = cv2.imread(arguments['image'], 0).astype(int)

As for astype, the input would be float only for some special images; normal ones are uint8:

That's the problem, the integers must be signed. It's not to convert floats.

Twenkid commented 5 years ago

Sure I'm using PyCharm, but text is more compressed than screenshots. Of course I run it line by line too, but based on my 20+ years longer experience than you, I know that it's useful to generate logs with the specific details wanted and to analyze them offline as well, besides watching the code live (with all the clutter of the IDE).

I will run what I want and as I want.

That's the problem, the integers must be signed. It's not to convert floats.

I know it's not to convert floats; I said that "astype" is usually applied for floats, not for ints, because what you do so far is not changing anything: it's still uint8, as shown in the printout.

It should be:

```
import numpy as np

...

image = cv2.imread(path, 0).astype(np.int)    # 32-bit signed
# or
image = cv2.imread(path, 0).astype(np.int16)  # 16-bit
```

np.int, not just int: https://docs.scipy.org/doc/numpy-1.13.0/user/basics.types.html ...

Then: if Channels > 1: image = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)

How is it different from specifying 0 in: image = cv2.imread(arguments['image'], 0).astype(int)

Ah, this is a special parameter; I use OpenCV but haven't needed this "magic number". The explicit conversion is clearer and more readable; also, image.shape returns 3 elements in case of color and 2 in case of BW (bad API design: it should return 0, not nothing).
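For comparison, a sketch of both routes (path is an assumed input file): the 0 flag is cv2.IMREAD_GRAYSCALE, which converts at load time, while cvtColor converts an already-loaded BGR image:

```
import cv2

path = "raccoon.jpg"  # assumed input file

gray1 = cv2.imread(path, 0)  # 0 == cv2.IMREAD_GRAYSCALE: single channel at load time
gray2 = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)  # explicit conversion after a color load
print(gray1.shape, gray2.shape)  # both are 2-element: (rows, cols)
```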

This detail has to be marked in comments to avoid these confusions.

If you insist on always one hard coded image, you don't need to parse arguments.

[image: capture_223_06092018_121207] https://user-images.githubusercontent.com/23367640/45179713-f5a9a200-b221-11e8-9912-4db0239268eb.jpg

boris-kz commented 5 years ago

You just broke the code, Todor, for no reason at all. The image doesn't load. Even if it did, all pixels would be unsigned, so differences from basic subtraction in comp would all be positive. You should know better, with your 20 years of experience.


Twenkid commented 5 years ago

This is how you understand things. By exploring and testing, not by following instructions.

Twenkid commented 5 years ago

Not 20 years, 20 years more than you.

Twenkid commented 5 years ago

On topic, regarding:

Because of this bug, frame_dblobs only initializes blobs, it doesn't increment or terminate them. So, half of it never actually runs. When debugging, run it line-by-line: breakpoint at line 257: for y in range(1, Y). It will give you an out-of-range error at y == 10; not sure why, but you don't need to go that far. In PyCharm's "Variables" tab, make sure to unfold all variables of "dP" and "vP": the results of processing the prior line.

The shape of _fork_ at the point of error is [] (via np.shape(pylist)). (Line numbers here don't match yours; they are to show the order of execution within the file.)


```
ln153 P[0] == _P[0]?
ln156 FORK_:  [[]]
_FORK_:  [[]]
ln160: if _x > ix, _x,ix:, ln160 1065 1063
ln142 while _ix <= x: ln142  1063 1065
scan_P_
ln138 ix **1065**
ln142 while _ix <= x: ln142  0 1541   # **_ix = 0**

**_ix** = _x - len(_P[6])... ln151  **1063** 1541 2
**_FORK_:  [[]]**
ln160: NOT if _x > ix: _fork_.shape ln163, (1, 0) [[]]

_ix = _x - len(_P[6])... ln151  1063 1541 2
_FORK_:  [[]]
ln160: NOT if _x > ix: _fork_.shape ln163, (1, 0) [[]]   ix = 1065 < _x = 1541
```

Thus  (if P[0] == _P[0]) is not hit

_fork_ is set in the if _buff_... block, thus there's something wrong with _buff_ or _P_

```
D:\Py\charm\ca\CogAlg\fork3.txt (10 hits)
    Line 57: y,Y 274: 1 1200
    Line 58: y,Y 274: 2 1200
    Line 59: y,Y 274: 3 1200
    Line 60: y,Y 274: 4 1200
    Line 61: y,Y 274: 5 1200
    Line 1590: y,Y 274: 6 1200
    Line 3147: y,Y 274: 7 1200
    Line 4676: y,Y 274: 8 1200
    Line 6128: y,Y 274: 9 1200
    Line 7587: y,Y 274: 10 1200
```
boris-kz commented 5 years ago

Yes, something to do with the transfers from one list to another. And that out-of-range error on line 149 (new): I am getting it at y == 10 if starting with y == 0, or y == 406 if starting with y == 400. That might be why the blobs don't accumulate. There are two lists there: _fork_ and _fork_[0][0][5] # blob _fork_ and _fork roots. Do you see which one is out of range?


boris-kz commented 5 years ago

The shape of _fork_ at the point of error is [] (via np.shape(pylist)). (Line numbers here don't match yours; they are to show the order of execution within the file.)

What do you mean by error? This is just one instance?


Thus (if P[0] == _P[0]) is not hit

When "while ix <= x" terminates? Why is that an error?

Twenkid commented 5 years ago

What do you mean by error? This is just one instance?

The program tries to read from an empty list, which interrupts it. Well, yes, you could set assertions and avoid that, like checking the shape of the list.

You could also add try-except blocks and catch such cases which you expect (or exceptions in general).

For example, they may be collected in a list and presented all together after a complete run (if it's possible to finish it with the mistakes), with the context of the error and the other variables. That way we could see many cases where the same exception occurs and see the big picture better, together with the coordinates of/on the actual input. We can spot the coordinates visually on the image, such as high contrast compared to the average, etc.

At the assertion points, data/patterns could be fed where they were supposed to go, with sample values of empty patterns or the previous pattern, or... you understand it better.

And/or the iteration could just be skipped, without shutting down the whole program. That way longer statistics could be collected.

_fork_ and _fork_[0][0][5] # blob _fork_ and _fork roots
Do you see which one is out of range?

Well, the interrupt (error) in my run and log was because _fork_ was an empty list ([], and shape is 0), but the code is trying to read [0][0][5]. Thus both are out of range: _fork_[0][0][5] doesn't exist, and neither does _fork_[0], etc.

When "while *ix <= x" terminates? Why is that an error?*

The termination is when it's False and the else is executed (the end of the log above)?

I'll check it and watch it in context later, within the environment; now I'm just typing.

Could you tell me what the expected answer for "when" is: what features/values/lengths, or do you just mean the coordinates of the current pixel x, y?

...

BTW, since you compare to filters, I assume it matters what the average and specific pixel values are after conversion to BW, and how they compare to your "magic numbers"; thus they may affect where the bugs hit.

Since the patterns are built from the input, the local contrast etc. also guides the process. So in my runs I may add some display.

...

One more thing, since dblobs is a self-contained module:

if __name__ == "__main__":
    main()

The global code would go in that function.

Then, when doing import ..., no unwanted code will be executed, and the governing (or debugging) module avoids the command line or automatic loading.

PS. As for the code breaking: sorry, you're right. I'd better not pull into your master branch, but just give suggestions on the side. You take or try them if you wish.

boris-kz commented 5 years ago

On Sat, Sep 8, 2018 at 4:55 AM Todor Arnaudov notifications@github.com wrote:

What do you mean by error? This is just one instance?

The program tries to read from an empty list, which interrupts it. Well, yes, you could set assertions and avoid that, like checking the shape of the list.

You could also add try-except blocks and catch such cases which you expect (or exceptions in general).

For example, they may be collected in a list and presented all together after a complete run (if it's possible to finish it with the mistakes), with the context of the error and the other variables, so we could see many cases where the same exception occurs and see the big picture better, together with the coordinates on the actual input.

At the assertion points, data/patterns could be fed where they were supposed to go, with sample values of empty patterns or the previous pattern, or... you understand it better.

And/or the iteration could just be skipped, without shutting off the whole program, so longer statistics could be collected.

_fork_ and _fork_[0][0][5] # blob _fork_ and _fork roots. Do you see which one is out of range?

Well, the interrupt (error) in my run and log was because _fork_ was an empty list ([], and shape is 0), but the code is trying to read [0][0][5] -

We are talking about line 153 (just updated)?

if len(_fork_) == 1 and _fork_[0][0][5] == 1 and y > rng * 2 + 1 + 400 and x < X - 300:  # margin set at 300: > len(fork_P[6])?

Why is it trying to read it? This is an AND expression; shouldn't a prior NOT len(_fork_) == 1 stop it?
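For reference, Python's and does short-circuit; judging by the logs above (FORK_: [[]], shape (1, 0)), the failing case is likely _fork_ == [[]]: len(_fork_) == 1 is true, so evaluation proceeds to the second operand, where _fork_[0] is an empty list. A minimal sketch reproducing that reading:

```
_fork_ = [[]]  # one element, itself an empty list, as in the logged "FORK_: [[]]"
print(len(_fork_) == 1)  # True, so `and` continues to the next operand

try:
    _fork_[0][0][5] == 1  # _fork_[0] is [], so the inner [0] already raises
except IndexError as e:
    print(e)  # list index out of range
```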

When "while " terminates? Why is that an error?*

The termination is when it's False and the else is executed (the end of the log above)?

This is a while loop, it should not execute unless _ix <= x


BTW, since you compare to filters, I assume it matters what the average and specific pixel values are after conversion to BW, and how they compare to your "magic numbers"; thus they may affect where the bugs hit.

It shouldn't matter, patterns will have different shape, but the code should run regardless. And frame_dblobs doesn't even compute vPs.

Since the patterns are built from the input, the local contrast etc. also guides the process. So in my runs I may add some display.

dPs are local contrast. Actually, they aren't very informative in dblobs, because there is no recursion yet, as in line_POC. dblobs is only for technical debugging.

...

One more thing, since dblobs is a self-contained module:

if name == "main": main()

The global code would go in that function.

Then, when doing import ..., no unwanted code will be executed, and the governing (or debugging) module avoids the command line or automatic loading.

Thanks, this is for running it outside PyCharm? Inside of it, I don't use main() at all.

PS. As for the code breaking: sorry, you're right. I'd better not pull into your master branch, but just give suggestions on the side. You take or try them if you wish.

Sure. BTW, I just posted some edits in dblobs.

Twenkid commented 5 years ago

When "while " terminates? Why is that an error?*

The termination is when it's False and the else is executed (the end of the log above)? This is a while loop, it should not execute unless _ix <= x

No, I meant the termination of the program due to the error, not of a pattern. :) I don't know about the patterns yet.


if len(_fork_) == 1 and _fork_[0][0][5] == 1 and y > rng * 2 + 1 + 400 and x < X - 300:  # margin set at 300: > len(fork_P[6])?

Why is it trying to read it? This is an AND expression; shouldn't a prior NOT len(_fork_) == 1 stop it?

I don't know, but it's not good style to test the content of a data structure that can be NULL/None.

What about nesting:

if len(_fork_) == 1:
    if _fork_[0][0][5] == 1 and y > rng * 2 + 1 + 400 and x < X - 300:
        ...

Then it will skip the dangerous check.

If you want to check objects that can be None, then it's good practice to wrap them in a try-except block, which would catch the exception gracefully.

BTW, since you compare to filters, I assume it matters what the average and specific pixel values are after conversion to BW, and how they compare to your "magic numbers"; thus they may affect where the bugs hit.

It shouldn't matter, patterns will have different shape, but the code should run regardless. And frame_dblobs doesn't even compute vPs.

Unless there are bugs...

Thanks, this is for running it outside PyCharm? Inside of it, I don't use main() at all.

Inside or outside, it doesn't matter. That's for convenience, it's a normal function (unlike the C main).

Another module, say "test.py" would import it:

import frame_dblobs  # no ".py" in an import statement

Then it would call the functions externally with a data set, store the patterns etc.

If there's no such check, it will first execute the code in the global part of the module when importing it, which won't be desirable.

With the check, the module can either be run stand-alone or imported without that side effect.
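A minimal sketch of such a driver (test.py is a hypothetical name; it assumes frame_dblobs.py has the __main__ guard above):

```
# test.py: hypothetical external driver
import frame_dblobs  # runs nothing at import time, thanks to the guard

# call what is needed explicitly, e.g. feed a chosen image and collect patterns:
# frame_dblobs.main()
```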

Twenkid commented 5 years ago

Also, very long expressions are more readable if broken down (for other developers, and after not working on it constantly), and that avoids accessing non-existing elements:


```
isEmpty = len(_fork_) == 0
yLimitA = y > rng * 2 + 1
xLimitA = x < X - 99

yLimitB = y > rng * 2 + 1 + 400
xLimitB = x < X - 300
forks = False if isEmpty else _fork_[0][0][5] == 1

if not isEmpty:
    if forks:
        if yLimitA and xLimitA:
            ...
        elif yLimitB and xLimitB:
            ...
```

I don't know about you, but when my logic expressions grow too complex, e.g. for parsing, the usage of named variables improves readability and eases debugging.

(Further, the numbers could be upperBorder, lowerBorder, left, right, etc., defined at the beginning of the module and more easily edited:

```
global upperBorder, leftBorder
upperBorder = 400; leftBorder = 300
```

They could also be edited by an external testing module.)

boris-kz commented 5 years ago

Thanks. I tried these things before and am doing it now. But it is silly that AND is not evaluated conditionally by default; it adds a lot of clutter. I still don't understand that try-except, but it seems that it will add even more clutter than nested IFs?

boris-kz commented 5 years ago

Ok, I rearranged 151-161, but line 155: if _fork_[0][0][5] == 1: is still never true: blobs initialize but don't accumulate, and it goes out of range at y == 8.


Twenkid commented 5 years ago

I agree on the silliness.

Try-excepts may be generated automatically if they are needed at a large scale.

Your code could stay as it is; it would be parsed (simply, just searching for IFs or particular blocks, not a complete analysis).

Debug and logging code would be added to a copy of your code, and the copy would be executed.

So you could develop yours without the clutter; the test version would include the additional stuff.

That could be assisted with so-called annotations in the comments, e.g.:

```
if blah-blah:   # @{
    ...
                # @}
```

Meaning "surround with try-catch" or whatever needed.

The preprocessor would watch for line numbers etc. and would display them.

Yes, you could observe it live while debugging, but this may collect the specific information wanted and present it as desired.

...

The exception scheme is (the simplest):

```
global errors
errors = []

try:
    ...  # the code that may raise
except Exception as e:
    print(e)
    print("some local data and parameters")
    errors.append(e)
    # should add more info about the exception

...

for e in errors:
    print(e)
```
Twenkid commented 5 years ago

Here is one report, see below (a long printout): out of range 307 times on a run with the raccoon, with the older version (sorry, I'll load the newer ones later; I wanted to test the principle).

One obvious discovery is that the out-of-range error comes near the end of the lines or at the very end (1023). The leftmost is at 976.

Note that the try-except block had to embrace the code below as well, because blob is defined in the failing IF block; if the try-except covered only that one, the following blocks would fail by using an undefined variable blob.

So blob etc. would have to be set to some default values (you know which ones), or, as below, the code after the failure is also skipped when there's an exception.

I don't know if that affects the following lines and the errors.

...

```
global errors, dbg
errors = []
dbg = False
```

...

```
def scan_P_(...):

    ...

    if _x > ix:  # x overlap between _P and next P: _P is buffered for next scan_P_, else included in blob:
        if dbg: print("ln160: if _x > ix, _x,ix:, ln160", _x, ix)
        buff_.append((_P, _x, _fork_, root_))
    else:
        if dbg: print("ln160: NOT if _x > ix: _fork_.shape ln163,", np.shape(_fork_), _fork_)
        try:
            if len(_fork_) == 1 and _fork_[0][0][5] == 1 and y > rng * 2 + 1 and x < X - 99:  # no fork blob if x < X - len(fork_P[6])?
                if dbg: print("ln166 len(_fork_)==1 and...np.shape ... ln166", np.shape(_fork_), _fork_)

                # if blob _fork_ == 1 and _fork roots == 1, always > 0: a bug probably appends fork_ outside scan_P_?
                blob = form_blob(_fork_[0], _P, _x)  # y-2 _P is packed in y-3 _fork_[0] blob + __fork_
            else:
                ave_x = _x - len(_P[6]) / 2  # average x of P: always integer?
                blob = _P, [_P], ave_x, 0, _fork_, len(root_)  # blob init, Dx = 0, no new _fork_ for continued blob

            if len(root_) == 0:  # never happens, probably due to the same bug
                net = blob, [blob]  # first-level net is initialized with terminated blob, no root_ to rebind
                if len(_fork_) == 0:
                    frame = term_network(net, frame)  # all root-mediated forks terminated, net is packed into frame
                else:
                    net, frame = term_blob(net, _fork_, frame)  # recursive root network termination test
            else:
                while root_:  # no root_ in blob: no rebinding to net at roots == 0
                    root_fork = root_.pop()  # ref to referring fork, verify?
                    root_fork.append(blob)  # fork binding, no convert to tuple: forms a new object?
        except Exception as e:
            print(e); errors.append(e)
            # print("x, P, P_, _buff_, _P_")
            print(str(x))
```
I also tried to print more local information (P, P_, _buff_, _P_), but it turned out to be too much; it's unmanageable at this level of detail for now.

The time Python spends traversing the structures to print them is too much. I may try pickle some time (binary copies, so maybe fewer/faster traversals).
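A minimal sketch of that pickle idea (the file name is assumed): dump the collected error contexts once and inspect them offline:

```
import pickle

with open("errors.pkl", "wb") as f:  # assumed dump file
    pickle.dump(errors, f)           # one binary write instead of many prints

# later, offline:
with open("errors.pkl", "rb") as f:
    errors = pickle.load(f)
```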

Exceptions: 307 list index out of range

```
(768, 1024) y,Y 274: 1 768 y,Y 274: 2 768 y,Y 274: 3 768 y,Y 274: 4 768 y,Y 274: 5 768 y,Y 274: 6 768 y,Y 274: 7 768 y,Y 274: 8 768 y,Y 274: 9 768 y,Y 274: 10 768 list index out of range

1023 y,Y 274: 11 768 y,Y 274: 12 768 y,Y 274: 13 768 y,Y 274: 14 768 y,Y 274: 15 768 list index out of range

1019 list index out of range

1022 y,Y 274: 16 768 list index out of range

1016 list index out of range

1016 list index out of range

1022 list index out of range

1023 y,Y 274: 17 768 y,Y 274: 18 768 y,Y 274: 19 768 y,Y 274: 20 768 y,Y 274: 21 768 list index out of range

987 list index out of range

987 y,Y 274: 22 768 list index out of range

987 list index out of range

990 list index out of range

1010 y,Y 274: 23 768 y,Y 274: 24 768 list index out of range

997 list index out of range

1017 y,Y 274: 25 768 list index out of range

1008 y,Y 274: 26 768 y,Y 274: 27 768 y,Y 274: 28 768 list index out of range

1020 y,Y 274: 29 768 y,Y 274: 30 768 y,Y 274: 31 768 y,Y 274: 32 768 y,Y 274: 33 768 y,Y 274: 34 768 y,Y 274: 35 768 y,Y 274: 36 768 y,Y 274: 37 768 list index out of range

1008 list index out of range

1012 y,Y 274: 38 768 list index out of range

1019 y,Y 274: 39 768 y,Y 274: 40 768 y,Y 274: 41 768 y,Y 274: 42 768 y,Y 274: 43 768 y,Y 274: 44 768 list index out of range

1020 y,Y 274: 45 768 list index out of range

1019 y,Y 274: 46 768 y,Y 274: 47 768 y,Y 274: 48 768 y,Y 274: 49 768 list index out of range

1017 y,Y 274: 50 768 y,Y 274: 51 768 y,Y 274: 52 768 y,Y 274: 53 768 list index out of range

1000 list index out of range

1017 y,Y 274: 54 768 list index out of range

1005 list index out of range

1005 list index out of range

1010 y,Y 274: 55 768 list index out of range

1001 list index out of range

1017 list index out of range

1017 list index out of range

1017 y,Y 274: 56 768 y,Y 274: 57 768 list index out of range

1012 y,Y 274: 58 768 list index out of range

1021 y,Y 274: 59 768 y,Y 274: 60 768 y,Y 274: 61 768 y,Y 274: 62 768 y,Y 274: 63 768 list index out of range

1022 y,Y 274: 64 768 y,Y 274: 65 768 y,Y 274: 66 768 y,Y 274: 67 768 y,Y 274: 68 768 list index out of range

1023 y,Y 274: 69 768 y,Y 274: 70 768 y,Y 274: 71 768 y,Y 274: 72 768 list index out of range

1021 y,Y 274: 73 768 y,Y 274: 74 768 list index out of range

1021 y,Y 274: 75 768 y,Y 274: 76 768 list index out of range

1017 y,Y 274: 77 768 y,Y 274: 78 768 y,Y 274: 79 768 y,Y 274: 80 768 y,Y 274: 81 768 y,Y 274: 82 768 y,Y 274: 83 768 y,Y 274: 84 768 list index out of range

995 list index out of range

1010 list index out of range

1018 list index out of range

1018 y,Y 274: 85 768 y,Y 274: 86 768 y,Y 274: 87 768 y,Y 274: 88 768 list index out of range

1008 list index out of range

1011 y,Y 274: 89 768 list index out of range

1014 y,Y 274: 90 768 list index out of range

1019 y,Y 274: 91 768 y,Y 274: 92 768 list index out of range

1023 y,Y 274: 93 768 list index out of range

1022 y,Y 274: 94 768 y,Y 274: 95 768 y,Y 274: 96 768 y,Y 274: 97 768 y,Y 274: 98 768 list index out of range

1015 list index out of range

1015 y,Y 274: 99 768 list index out of range

1010 list index out of range

1013 y,Y 274: 100 768 y,Y 274: 101 768 list index out of range

1014 y,Y 274: 102 768 list index out of range

1013 y,Y 274: 103 768 list index out of range

1012 y,Y 274: 104 768 y,Y 274: 105 768 list index out of range

1007 y,Y 274: 106 768 y,Y 274: 107 768 y,Y 274: 108 768 y,Y 274: 109 768 y,Y 274: 110 768 list index out of range

1000 list index out of range

1018 y,Y 274: 111 768 list index out of range

1017 list index out of range

1017 y,Y 274: 112 768 y,Y 274: 113 768 y,Y 274: 114 768 y,Y 274: 115 768 list index out of range

1016 y,Y 274: 116 768 list index out of range

1023 y,Y 274: 117 768 y,Y 274: 118 768 list index out of range

1022 y,Y 274: 119 768 y,Y 274: 120 768 y,Y 274: 121 768 y,Y 274: 122 768 y,Y 274: 123 768 list index out of range

1021 y,Y 274: 124 768 list index out of range

1021 y,Y 274: 125 768 y,Y 274: 126 768 list index out of range

1016 y,Y 274: 127 768 y,Y 274: 128 768 y,Y 274: 129 768 y,Y 274: 130 768 y,Y 274: 131 768 list index out of range

1017 y,Y 274: 132 768 y,Y 274: 133 768 y,Y 274: 134 768 y,Y 274: 135 768 y,Y 274: 136 768 list index out of range

1016 list index out of range

1017 list index out of range

1017 y,Y 274: 137 768 y,Y 274: 138 768 y,Y 274: 139 768 y,Y 274: 140 768 y,Y 274: 141 768 y,Y 274: 142 768 list index out of range

1022 list index out of range

1022 y,Y 274: 143 768 y,Y 274: 144 768 y,Y 274: 145 768 y,Y 274: 146 768 list index out of range

993 y,Y 274: 147 768 list index out of range

995 list index out of range

1010 list index out of range

1011 y,Y 274: 148 768 list index out of range

1023 y,Y 274: 149 768 y,Y 274: 150 768 y,Y 274: 151 768 list index out of range

1017 list index out of range

1018 y,Y 274: 152 768 y,Y 274: 153 768 list index out of range

1018 y,Y 274: 154 768 y,Y 274: 155 768 y,Y 274: 156 768 y,Y 274: 157 768 list index out of range

1010 y,Y 274: 158 768 list index out of range

1012 y,Y 274: 159 768 list index out of range

1010 y,Y 274: 160 768 list index out of range

1019 list index out of range

1023 y,Y 274: 161 768 y,Y 274: 162 768 y,Y 274: 163 768 y,Y 274: 164 768 y,Y 274: 165 768 y,Y 274: 166 768 y,Y 274: 167 768 y,Y 274: 168 768 list index out of range

1016 y,Y 274: 169 768 y,Y 274: 170 768 y,Y 274: 171 768 y,Y 274: 172 768 list index out of range

991 list index out of range

1001 y,Y 274: 173 768 list index out of range

976 list index out of range

1002 list index out of range

1003 y,Y 274: 174 768 list index out of range

1021 y,Y 274: 175 768 y,Y 274: 176 768 y,Y 274: 177 768 y,Y 274: 178 768 y,Y 274: 179 768 list index out of range

1021 y,Y 274: 180 768 y,Y 274: 181 768 list index out of range

1023 y,Y 274: 182 768 y,Y 274: 183 768 y,Y 274: 184 768 y,Y 274: 185 768 y,Y 274: 186 768 y,Y 274: 187 768 y,Y 274: 188 768 y,Y 274: 189 768 y,Y 274: 190 768 y,Y 274: 191 768 y,Y 274: 192 768 y,Y 274: 193 768 list index out of range

1015 y,Y 274: 194 768 list index out of range

1022 y,Y 274: 195 768 list index out of range

1022 y,Y 274: 196 768 list index out of range

1021 y,Y 274: 197 768 y,Y 274: 198 768 y,Y 274: 199 768 y,Y 274: 200 768 y,Y 274: 201 768 y,Y 274: 202 768 y,Y 274: 203 768 y,Y 274: 204 768 y,Y 274: 205 768 y,Y 274: 206 768 y,Y 274: 207 768 list index out of range

1021 y,Y 274: 208 768 list index out of range

1017 y,Y 274: 209 768 y,Y 274: 210 768 y,Y 274: 211 768 y,Y 274: 212 768 y,Y 274: 213 768 list index out of range

1021 y,Y 274: 214 768 y,Y 274: 215 768 y,Y 274: 216 768 list index out of range

1022 y,Y 274: 217 768 y,Y 274: 218 768 y,Y 274: 219 768 y,Y 274: 220 768 list index out of range

1023 y,Y 274: 221 768 y,Y 274: 222 768 y,Y 274: 223 768 y,Y 274: 224 768 y,Y 274: 225 768 y,Y 274: 226 768 y,Y 274: 227 768 list index out of range

1006 list index out of range

1016 list index out of range

1016 y,Y 274: 228 768 y,Y 274: 229 768 y,Y 274: 230 768 y,Y 274: 231 768 y,Y 274: 232 768 y,Y 274: 233 768 list index out of range

1021 y,Y 274: 234 768 y,Y 274: 235 768 y,Y 274: 236 768 y,Y 274: 237 768 y,Y 274: 238 768 y,Y 274: 239 768 list index out of range

1022 y,Y 274: 240 768 y,Y 274: 241 768 y,Y 274: 242 768 y,Y 274: 243 768 list index out of range

1018 y,Y 274: 244 768 y,Y 274: 245 768 y,Y 274: 246 768 y,Y 274: 247 768 y,Y 274: 248 768 y,Y 274: 249 768 y,Y 274: 250 768 y,Y 274: 251 768 y,Y 274: 252 768 y,Y 274: 253 768 y,Y 274: 254 768 y,Y 274: 255 768 y,Y 274: 256 768 y,Y 274: 257 768 y,Y 274: 258 768 y,Y 274: 259 768 y,Y 274: 260 768 list index out of range

1021 y,Y 274: 261 768 y,Y 274: 262 768 y,Y 274: 263 768 y,Y 274: 264 768 y,Y 274: 265 768 y,Y 274: 266 768 list index out of range

1023 y,Y 274: 267 768 list index out of range

1018 y,Y 274: 268 768 list index out of range

1022 y,Y 274: 269 768 y,Y 274: 270 768 y,Y 274: 271 768 y,Y 274: 272 768 y,Y 274: 273 768 y,Y 274: 274 768 y,Y 274: 275 768 y,Y 274: 276 768 list index out of range

1006 y,Y 274: 277 768 list index out of range

1002 y,Y 274: 278 768 list index out of range

1003 list index out of range

1018 y,Y 274: 279 768 y,Y 274: 280 768 list index out of range

1016 list index out of range

1022 y,Y 274: 281 768 y,Y 274: 282 768 list index out of range

1022 y,Y 274: 283 768 list index out of range

1000 y,Y 274: 284 768 list index out of range

1016 y,Y 274: 285 768 list index out of range

1022 list index out of range

1022 y,Y 274: 286 768 list index out of range

1013 y,Y 274: 287 768 y,Y 274: 288 768 y,Y 274: 289 768 y,Y 274: 290 768 y,Y 274: 291 768 list index out of range

1006 y,Y 274: 292 768 list index out of range

994 list index out of range

1016 y,Y 274: 293 768 list index out of range

1001 y,Y 274: 294 768 y,Y 274: 295 768 y,Y 274: 296 768 y,Y 274: 297 768 y,Y 274: 298 768 list index out of range

1006 list index out of range

1019 y,Y 274: 299 768 list index out of range

987 y,Y 274: 300 768 y,Y 274: 301 768 y,Y 274: 302 768 y,Y 274: 303 768 y,Y 274: 304 768 y,Y 274: 305 768 y,Y 274: 306 768 y,Y 274: 307 768 list index out of range

1018 y,Y 274: 308 768 list index out of range

1020 y,Y 274: 309 768 y,Y 274: 310 768 y,Y 274: 311 768 list index out of range

1012 y,Y 274: 312 768 y,Y 274: 313 768 y,Y 274: 314 768 y,Y 274: 315 768 y,Y 274: 316 768 y,Y 274: 317 768 y,Y 274: 318 768 y,Y 274: 319 768 y,Y 274: 320 768 y,Y 274: 321 768 y,Y 274: 322 768 y,Y 274: 323 768 y,Y 274: 324 768 y,Y 274: 325 768 y,Y 274: 326 768 y,Y 274: 327 768 y,Y 274: 328 768 y,Y 274: 329 768 y,Y 274: 330 768 y,Y 274: 331 768 y,Y 274: 332 768 y,Y 274: 333 768 y,Y 274: 334 768 list index out of range

1022 list index out of range

1022 list index out of range

1022 y,Y 274: 335 768 y,Y 274: 336 768 list index out of range

1022 list index out of range

1022 list index out of range

1022 y,Y 274: 337 768 list index out of range

1014 list index out of range

1018 y,Y 274: 338 768 list index out of range

1021 y,Y 274: 339 768 y,Y 274: 340 768 list index out of range

1020 y,Y 274: 341 768 y,Y 274: 342 768 y,Y 274: 343 768 list index out of range

1008 list index out of range

1022 y,Y 274: 344 768 list index out of range

1018 list index out of range

1018 list index out of range

1023 y,Y 274: 345 768 y,Y 274: 346 768 y,Y 274: 347 768 list index out of range

1008 y,Y 274: 348 768 list index out of range

1017 y,Y 274: 349 768 y,Y 274: 350 768 list index out of range

1023 y,Y 274: 351 768 y,Y 274: 352 768 y,Y 274: 353 768 y,Y 274: 354 768 y,Y 274: 355 768 y,Y 274: 356 768 y,Y 274: 357 768 y,Y 274: 358 768 list index out of range

1020 y,Y 274: 359 768 list index out of range

1020 y,Y 274: 360 768 list index out of range

1018 y,Y 274: 361 768 y,Y 274: 362 768 y,Y 274: 363 768 y,Y 274: 364 768 y,Y 274: 365 768 list index out of range

1023 y,Y 274: 366 768 y,Y 274: 367 768 y,Y 274: 368 768 y,Y 274: 369 768 y,Y 274: 370 768 y,Y 274: 371 768 y,Y 274: 372 768 y,Y 274: 373 768 y,Y 274: 374 768 list index out of range

1018 y,Y 274: 375 768 y,Y 274: 376 768 y,Y 274: 377 768 y,Y 274: 378 768 list index out of range

1023 y,Y 274: 379 768 list index out of range

1023 y,Y 274: 380 768 y,Y 274: 381 768 list index out of range

1023 list index out of range

1023 list index out of range

1023 y,Y 274: 382 768 y,Y 274: 383 768 y,Y 274: 384 768 y,Y 274: 385 768 list index out of range

1023 y,Y 274: 386 768 list index out of range

1023 y,Y 274: 387 768 y,Y 274: 388 768 y,Y 274: 389 768 y,Y 274: 390 768 y,Y 274: 391 768 y,Y 274: 392 768 list index out of range

1023 y,Y 274: 393 768 list index out of range

1014 list index out of range

1022 y,Y 274: 394 768 y,Y 274: 395 768 y,Y 274: 396 768 y,Y 274: 397 768 y,Y 274: 398 768 y,Y 274: 399 768 y,Y 274: 400 768 y,Y 274: 401 768 y,Y 274: 402 768 y,Y 274: 403 768 y,Y 274: 404 768 y,Y 274: 405 768 y,Y 274: 406 768 list index out of range

1000 y,Y 274: 407 768 list index out of range

1005 y,Y 274: 408 768 y,Y 274: 409 768 y,Y 274: 410 768 y,Y 274: 411 768 y,Y 274: 412 768 y,Y 274: 413 768 y,Y 274: 414 768 y,Y 274: 415 768 y,Y 274: 416 768 y,Y 274: 417 768 y,Y 274: 418 768 y,Y 274: 419 768 y,Y 274: 420 768 y,Y 274: 421 768 y,Y 274: 422 768 y,Y 274: 423 768 list index out of range

1016 list index out of range

1023 y,Y 274: 424 768 list index out of range

1012 list index out of range

1018 y,Y 274: 425 768 list index out of range

1017 y,Y 274: 426 768 y,Y 274: 427 768 y,Y 274: 428 768 y,Y 274: 429 768 y,Y 274: 430 768 y,Y 274: 431 768 list index out of range

1005 y,Y 274: 432 768 y,Y 274: 433 768 list index out of range

1018 y,Y 274: 434 768 y,Y 274: 435 768 y,Y 274: 436 768 y,Y 274: 437 768 y,Y 274: 438 768 list index out of range

1008 y,Y 274: 439 768 y,Y 274: 440 768 y,Y 274: 441 768 list index out of range

1017 list index out of range

1018 y,Y 274: 442 768 list index out of range

1020 y,Y 274: 443 768 y,Y 274: 444 768 y,Y 274: 445 768 y,Y 274: 446 768 list index out of range

[... hundreds of similar printouts omitted: the same "list index out of range" error recurs near the right edge of each row (x ≈ 976-1023), for y rows 447 through 767 ...]

1018 y,Y 274: 767 768 26.156800270080566 None

boris-kz commented 5 years ago

On Sun, Sep 9, 2018 at 3:49 PM Todor Arnaudov notifications@github.com wrote:

Here is one report, see below (a long printout). Out of range: 307 times on a run with the raccoon image, with the older version (sorry, I'll load the newer ones later; I wanted to test the principle).

One obvious discovery is that the out-of-range index occurs near the end of the lines or at the very end (1023). The leftmost is at 976.

Thanks. That shouldn't happen with the new code because evaluation is now conditional on:

if x < X - 200: # right error margin: >len(fork_P[6])

Note that the try-except block had to embrace the code below as well, because blob is defined in the failing IF block; if the try-except covers only that block, the following blocks fail with an undefined variable blob.

So blob etc. have to be set with some default values, you know which ones,

Yes, I was going to declare it as a tuple of lists, but didn't think it was necessary
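A minimal sketch of that default - a tuple of lists set before the try, so the following blocks never see an undefined name; the real field layout of blob is project-specific and risky_form_blob is a hypothetical stand-in:

def risky_form_blob():
    raise IndexError  # stand-in for the failing IF block

blob = [], [], []  # hypothetical default: a tuple of empty lists
try:
    blob = risky_form_blob()
except IndexError:
    pass
print(blob)  # later blocks see the default instead of an undefined name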

boris-kz commented 5 years ago

(main()) It's for another module, say "test.py", that would import it:

import frame_dblobs

Then call the functions externally with a data set, store the patterns etc.

If there's no such check, importing the module will first execute the code in its global part, which isn't desirable. With the check, the module can either be run stand-alone or imported without that side effect.
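A minimal sketch of that check; main is a hypothetical entry point, not a name from the actual file:

# frame_dblobs.py (sketch)
def main():
    print("running stand-alone")

if __name__ == "__main__":  # true only when the file is executed directly
    main()

# test.py
# import frame_dblobs       # no code runs on import
# frame_dblobs.main()       # called explicitly, with whatever data set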

Ok, that seems a bit specific, don't know if I want it as a default here.

Try-except may be generated automatically if they are needed at a large scale.

Your code could stay as it is; it would be parsed (simply - just searching for IF or particular blocks, not a complete analysis).

Debug and logging code would be added to a copy of your code, and the copy would be executed.

So you could develop yours without the clutter; the test version would include the additional stuff.

That could be assisted with so called annotations in the comments, e.g.:

if blah-blah: # @{ ...

@ }

Meaning "surround with try-catch" or whatever needed.

The preprocessor would watch for line numbers etc. and would display them.

Yes, you could observe it live while debugging, but this would collect the specific information you want and present it the way you want.

Thanks, I will look into this. But what I really need is the values of variables before an error, whatever the error is. Pycharm is telling me that "variables are not available". You got them from np.shape(pylist)? How does that work? Or Pickle, but I only need a few steps back, not the whole log? Some kind of running buffer?

Twenkid commented 5 years ago

But what I really need is the values of variables before an error, whatever the error is. Pycharm is telling me that "variables are not available". You got them from np.shape(pylist)? How does that work? Or Pickle, but I only need a few steps back, not the whole log? Some kind of running buffer?

I used simple printing. To get it as output, without saving to a file explicitly, it's just:

python dblobs.py > log.txt

The output of print would go there.

np.shape(list) returns just the shape of a general list.

Better than print is to save to a log structure and to a file/files explicitly, with flags on/off. The data overload could be partially solved by pausing the run, or by saving not all the data but only a limited number of snapshots during the run (if there are still many exceptions).

...

Yes, it could be a running buffer, like a stack, where the desired variables are stored and after a certain number of steps the oldest are removed.

When there's an error - print that buffer or collect these recent data for another log.
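A minimal sketch of that running buffer, assuming we only care about a few named variables; collections.deque with maxlen drops the oldest entries by itself, and the names in wanted are illustrative:

from collections import deque

history = deque(maxlen=20)  # keeps only the last 20 snapshots

def snapshot(ctx, wanted=('P', 'ix', 'x')):
    history.append({k: ctx[k] for k in wanted if k in ctx})

def dump_history():
    for step, state in enumerate(history):
        print(step, state)

# inside the processing loop:  snapshot(locals())
# in the except block:         dump_history()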

You could also print for every step, but pause the run when there's an error:

cv2.waitKey(0) #any key

or

blah = input("prompt") #waits for enter

...

BTW, maybe the most appropriate way is to use Python's reflection functions.

Some simple ones are:

globals() returns a dictionary of the global vars - both names and values; locals() returns the ones in the function's local running context.

So they can be printed directly or more pretty by a simple iteration, not manually per variable. They can also be changed, new variables added, the values changed by iterating this structure.

context = locals()
print(context)

wanted_vars = ['P', '_P', 'blob', 'ix', '_ix']  #whatever identifiers

for w in wanted_vars:
  print(context[w])

http://blog.lerner.co.il/four-ways-to-assign-variables-in-python/

...

I see there's a new more sophisticated module in Python 3.7:

https://docs.python.org/3/library/contextvars.html https://stackoverflow.com/questions/50854974/context-variables-in-python https://twitter.com/raymondh/status/959455137893793793

It's possible that Pycharm loses the variables due to those "thread leaching" explained in the twitter post.

We'll see, maybe locals()/globals() would be enough, though, I don't know.

...

I guess you weren't asking about rerunning failed cycles with altered data, but I imagine the following possible procedure, which I think is interesting and may be useful later.

So, the context would be copied at the beginning of a function - maybe just locals() would be enough - or also at the beginning and/or end of each iteration, as long as memory allows.

...

The function content would be nested within two other loops, though, in order to allow rerunning.


import contextvars  # if needed
# locals(), globals() are built-in functions

def scan(....):
    # copy the local context or selected items, immediately after the header
    while initfunc:
        ...  # load the context from the copy - it could be changed from the exception block
        ...  # init loop variables
        while processing:  # embraces your processing loop
            while _ix <= x:  # the actual loop
                if not processing: break
                # or break at the end of the cycle, after the exception handling
                try:
                    ...
                except ...:
                    # print the local context
                    # copy the old context, change some of the variables if desirable
                    processing = False
                    continue  # when using continue in a while loop, make sure the counters/logic
                              # are updated in the alternative branches, to avoid endless loops
                ...
            # end of while _ix <= x

When there's an exception, you could log, dump etc., but also revert the values to the copied ones, change some, and rerun the current loop/the function with the changed values.

If everything is OK, the "processing" flag would remain True and the cycle would end when the _ix <= x loop finishes.

In the exception block, it would be set to False, then checked, and the processing loop thus interrupted.

Then control goes back to the upper loop, the local variables/parameters are reset.

The context could be saved on each cycle in a running round-robin buffer.

...

That changing of values could go also for adjusting some border cases in experimental comparisons, like these y < ... x < ... In case of an error, several combinations could be checked automatically (or by default many could be checked in case of an error in some of them, then logged which one worked).

...

The (correct) reverting could turn out to be more sophisticated, because you may have to pop something out accordingly; or the pushing to the lists may be postponed a little - done at the end of the loops, after the iterations have gone through without errors.

That would be hindered or complicated if the intermediate code depends on the pushes to the lists.

So if everything is fine, the values would be pushed to the main pattern-store, otherwise they would be cancelled and the cycle would be adjusted and repeated.
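A minimal sketch of that deferred commit; pattern_store and run_cycle are hypothetical names, and the per-iteration work is a stand-in:

pattern_store = []

def run_cycle(items):
    staged = []  # staging area: pushes are postponed until the cycle succeeds
    try:
        for it in items:
            staged.append(it * 2)  # stand-in for the per-iteration work
    except TypeError:
        staged.clear()  # roll back: the failed cycle leaves no trace in the main store
    else:
        pattern_store.extend(staged)  # commit only after the loop finished cleanly

run_cycle([1, 2, 3])
print(pattern_store)  # [2, 4, 6]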

Well, these are "transactions". Perhaps there's an easier way to do that, also, something like simple databases, if needed.

They may come in handy due to the general query language for generating reports.

e.g.

SELECT * from scanP where len_f_list = 0

SELECT * from scanP where len_f_list > 5 and ix > 245 and ...

These are just the simplest use-cases.

...

As for the clutter that would grow --> the custom preprocessor, which would add all these additional functions over the generic code.

boris-kz commented 5 years ago

On Mon, Sep 10, 2018 at 6:15 AM Todor Arnaudov notifications@github.com wrote:

...

I see there's a new more sophisticated module in Python 3.7:

I am using 2.7, what does it take to convert it to 3?

...

Ok, that's all great. Could you modify scan_P so it will print the values of locals before an error / before processing stops? Thanks.

Twenkid commented 5 years ago

Conversion - in your case, maybe nothing. Py2 accepts print without parentheses; there are differences in some libraries' syntax and parameters, plus new features and renamed functions, but your code is generic.

The required libraries you import have to be reinstalled in their Python 3 versions, though.

I use Py3 and didn't have to convert it.

OK about scan_P.

Twenkid commented 5 years ago

Please see the code and the comments: https://github.com/Twenkid/CogAlg/blob/master/frame_dblobs_debug.py

I used additional modules: copy and re (regular expressions), to record the states correctly and to format the printouts a bit, because the Python default dumps lack new lines and average text editors choke on them. (Special ones are needed for big files.)

There should be some modules for "pretty print" like this one: https://docs.python.org/2/library/pprint.html

But I think my simple reg-exes are fine for a start.
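For comparison, a quick check of the standard pprint module on a nested P-like tuple; the sample data here is made up, just to show the unfolding:

from pprint import pprint

P = (0, 449, -131, [(69, 0, -242), (70, -10, -261), (70, -23, -290)])
pprint(P, width=40)  # wraps and indents nested containers instead of one long line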

...

One suggestion: I think putting mnemonic stable labels in comments at the important comparisons or processing, as with these tracking labels, would be helpful, because the line numbers change more often. Also, these labels could be automatically used for generating such record-points with meaningful names (an automatic label-generator could be made also.)

(A custom debugger should show the actual code as well, of course)

BTW, I noticed also that Pycharm sometimes disturbs the formatting, adds new lines and spaces when copy-pasting code. (For working with just one file, cloning from a repository, creating new folders etc. is an overkill.)


boris-kz commented 5 years ago

Thanks a lot, Todor! This is certainly a good start, although my blob variables are too complex: pretty much unintelligible without hierarchical unfolding (as in pycharm variables). Do you know how they do it, or maybe we can automatically display and buffer them?

Just stepping through the code in pycharm, I noticed that the first error is at y==7, x~823. Which is weird, because the error line if _fork_[0][0][5] == 1: shouldn't run at x > X - 200?

Twenkid commented 5 years ago

I'm glad it's in a good direction.

Just stepping through the code in pycharm, I noticed that the first error is at y==7, x~823. Which is weird because error line: if fork[0][0][5] == 1: shouldn't run at x > X - 200?

Isn't it a row earlier? The log from my run, displayed in the screenshots, shows x = 822. Also, wasn't the shape 768,1024, i.e. 1024-200 = 824?

This is certainly a good start, although my blob variables are too complex: pretty much unintelligible without hierarchical unfolding (as in pycharm variables). Do you know how they do it, or maybe we can automatically display and buffer them?

Well, I have a few ideas.

Debuggers are supposed to modify the original code with additional checks (assertions); they add breakpoints (the CPUs themselves have functions for debug-oriented breaks) and have full access to the runtime structures. In Python, the user also has that access, through reflection functions like locals().

One solution:

if x < X - 200:  # right error margin: >len(fork_P[6])?
    ini = 1
    if y > rng * 2 + 1:  # beyond the first line of _Ps
        if len(_fork_) == 1:  # always > 0: fork_ appended outside scan_P?
            if _fork_[0][0][5] == 1:  # _fork_ roots, see ln161, never == 1?
                blob_seg = form_blob_seg(_fork_[0], _P, _x)  # _P (y-2) is packed in _fork_[0] blob segment + __fork_ (y-3)
                ini = 0  # no seg initialization
return ini, blob_seg


**Could turn into something like this:**

#watched   - selected variables which have to be recorded
#assertions - checks of some of the variables, it would be an object of a class at best - this is optional, probably not needed for now, the code itself does checks and breaks on errors

Debug(151, locals(), watched, assertions)  # the following line is 151 in the original file
try:
    if x < X - 200:  # right error margin: >len(fork_P[6])?
        ini = 1
        if y > rng * 2 + 1:  # beyond the first line of _Ps
            if len(_fork_) == 1:  # always > 0: fork_ appended outside scan_P?
                if _fork_[0][0][5] == 1:  # _fork_ roots, see ln161, never == 1?
                    blob_seg = form_blob_seg(_fork_[0], _P, _x)  # _P (y-2) is packed in _fork_[0] blob segment + __fork_ (y-3)
                    ini = 0  # no seg initialization
    return ini, blob_seg
...


With these functions - more complex than just logging, as in the current version - we could pause the execution like Pycharm does: Debug could have logic for waiting for user input, which could be triggered dynamically, and could also interactively watch certain variables.

I'd also add debug-processing before each line (not selectively) in order to do **executed code unfolding.**

Instead of jumping within a source file like the normal debuggers do, which is more distracting because it breaks the continuity, it would print which lines are executed **linearly, with indenting and nesting,** with some watched variables or other indications.

It could be with formatting (HTML, colors, font-size) and emphasize or deemphasize this or that. 

It could also be interactive - currently observed segment is drawn like this, say 1000 or 10000 lines around.

This may allow us to  detect visually certain error patterns or anomalies.

**Tracking tuple-list structures**

I'll try pprint or look for other tools, which automatically unfold nested structures when printing.

However, **if you mean the problem of losing the variable identifiers during tuple packing?**

I think of one solution, while still keeping tuples (your choice). 

Maybe it'd be easier if using class objects, because they have a type and their variables are named, but I'm not sure about that, because you're right that it adds clutter. 

With just tuples, I'd traverse the code sequentially as executed and would find the **assignment operations** and **append and pop**. 

Parsing is required, but I developed one in a couple of days in a previous CogAlg session, then discovered that Python itself has functions for that and for compiling. So that kind of parsing is not complex.

The Debug-each-line thing may help by searching for differences between current and previous states.

Say:

blob_seg = _P, [_P], ave_x, 0, fork_, len(root_)  =>  blob_seg = _P, [_P], ave_x, 0, fork_, len(root_), GetID(161)

buff_.append((_P, _x, fork_, root_))  ==>  buff_.append((_P, _x, fork_, root_, GetID(149)))

The simplest return value for "GetID"  could be just an identifier or a number which is mapped to the different types of variables. 

Thus the types also would have to be enumerated and classified, so that the readers of those tuples know what to expect.

When reading those modified tuples back, the left hand side would have to be modified accordingly, too,  and we shouldn't miss an assignment, because it would break the mapping.

`_P, _x, _fork_, root_, ID = _buff_.popleft()`

Then the Debug part could use the ID as a type-identifier and print the respective tuples with the respective labels: x, m, d ... Or more information, if we pack the ID to point to somewhere else.
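A minimal sketch of the GetID idea: a counter plus a registry mapping each id to a type label, so the Debug code knows how to label the tuple's fields. The labels here are illustrative:

import itertools

_counter = itertools.count()
_id_to_type = {}

def GetID(type_label):
    new_id = next(_counter)
    _id_to_type[new_id] = type_label
    return new_id

blob_seg = ('_P', ['_P'], 0, 0, 'fork_', 1, GetID('blob_seg'))  # extended tuple
print(_id_to_type[blob_seg[-1]])  # -> 'blob_seg'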

( 
 You once mentioned about tracking the object IDs in Pycharm, I think that's their address, it's what is printed by default if a class object is fed to print():
`<__main__.dP_Class object at 0x7fa247bddc50>`

 However inside the constructed tuples there would be just the id of the compound object. 
)

In general, code without explicit complex named data structures may seem more elegant for nesting etc., but we see the drawbacks. Classes deal with the "type-stuff" automatically, but are more cluttered.

Another solution could be to do a more complex code transformation:

**Annotation of where the data types are defined - the first assignments.**

Then automatic generation of classes, based on that structure.

Then when using the respective structure - using a class object, instead of a tuple.

With proper comments, it could be automated:

E.g.

dP = 0, 0, 0, 0, 0, 0, []  # lateral difference pattern = pri_s, I, D, Dy, V, Vy, ders2_

class dP_Class:
    def __init__(s, pri_s, I, D, Dy, V, Vy, ders2_):  # s - self
        s.pri_s = pri_s
        s.I = I
        s.D = D
        s.Dy = Dy
        s.V = V
        s.Vy = Vy
        s.ders2_ = ders2_

Then:

dP = dP_Class(0, 0, 0, 0, 0, 0, [])



Then when needed for printing, using reflection functions (they could be syntactically sophisticated, though): 
https://stackoverflow.com/questions/9058305/getting-attributes-of-a-class

Or using our class definitions (with simple parsing of the constructor header:  s, pri_s, I, D, Dy, V, Vy, ders2) in order to print labels.
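For instance, a minimal sketch of printing labeled members through reflection, using the built-in vars() (the instance's __dict__ view) with the dP_Class fields from above:

class dP_Class:
    def __init__(s, pri_s, I, D, Dy, V, Vy, ders2_):
        s.pri_s, s.I, s.D, s.Dy, s.V, s.Vy, s.ders2_ = pri_s, I, D, Dy, V, Vy, ders2_

dP = dP_Class(0, 0, 0, 0, 0, 0, [])
for name, value in vars(dP).items():  # vars() returns the instance __dict__
    print(name, '=', value)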

...

I'd choose the tuple-extension for now. It seems easier, and easier to debug.

boris-kz commented 5 years ago

On Wed, Sep 12, 2018 at 6:49 AM Todor Arnaudov notifications@github.com wrote:

I'm glad it's in a good direction.

Just stepping through the code in pycharm, I noticed that the first error is at y==7, x~823. Which is weird, because the error line if _fork_[0][0][5] == 1: shouldn't run at x > X - 200?

Isn't it a row earlier? The log from my run, displayed in the screenshots, shows x = 822.

Ah yes, I didn't see it in pycharm console and haven't used WinMerge yet

Also, wasn't the shape 768,1024, i.e. 1024-200 = 824?

Yes, it's almost the last possible... So, fork_ belongs to blobs in line y-1, referring back to patterns in line y, which represent root_ -> roots (_fork_[0][0][5]) back in y-1. Which may be beyond the error margin, thus don't exist. So, I may need to put an error margin before calling scan_P in form_P, or something...

More on displaying vars later.

Twenkid commented 5 years ago

BTW, regarding classes, I overcomplicated it - complex reflection is not needed. We know the members, so a simple printing function defined in each class - possibly generated from the class constructor's parameter list, or written manually, since they are few - would do the job.

I know the alg. doesn't need classes for now, but I speculate that as understanding grows and we know better what it's doing, they may turn out to be handy. Including for self-modifying code - adding or removing members from the classes at run time.

boris-kz commented 5 years ago

...

Tracking tuple-list structures

I'll try pprint or look for other tools, which automatically unfold nested structures when printing.

Actually, I meant manual unfolding, as in pycharm variables. My patterns / blobs have hierarchical structure; I want to be able to click on higher levels to see lower-level details selectively, or just skip them. I just added a simple try-except to frame_dblobs:

try:
    if _fork_[0][0][5] == 1:  # _fork roots, see ln161, never == 1?
        blob_seg = form_blob_seg(_fork_[0], _P, _x)  # _P (y-2) is packed in _fork_[0] blob segment + __fork_ (y-3)
        ini = 0  # no blob segment initialization
        return ini, blob_seg
except:
    break

and put a breakpoint before "break", that preserved variable values. That's ok for now, I just didn't know about try-except :).

...

(You once mentioned about tracking the object IDs in Pycharm, I think that's their address, it's what is printed by default if a class object is fed to print(): <main.dP_Class object at 0x7fa247bddc50>

However inside the constructed tuples there would be just the id of the compound object.)

Thanks, that could help in the future

In general, code without explicit complex named data structures may seem more elegant for nesting etc., but we see the drawbacks.

I will add some labels in the comments, as you suggested. Making them explicit is a case-by-case thing. Sorry, I am a bit distracted, still figuring out comparison by division.

...

BTW, regarding classes, I overcomplicated it - complex reflection is not needed. We know the members, so a simple printing function defined in each class - possibly generated from the class constructor's parameter list, or written manually, since they are few - would do the job.

I know the alg. doesn't need classes for now, but I speculate that as understanding grows and we know better what it's doing, they may turn out to be handy. Including for self-modifying code - adding or removing members from the classes at run time.

I don't know, it seems that classes are mostly useful for linking disparate programs, and cogalg is self-contained. Anyway, there are better things to focus on for now.

Twenkid commented 5 years ago

Actually, I meant manual unfolding, as in pycharm variables. My patterns / blobs have hierarchical structure; I want to be able to click on higher levels to see lower-level details selectively, or just skip them.

I understood that as well, yes. It could be done. The standard GUI component for that is called TreeView.

It could be done in Python, too, either with a tree view from some GUI library or with a simpler custom component which renders the GUI in, say, OpenCV, it doesn't have to be visually polished.

However you obviously need to know the structure of the tree, before rendering it correctly...

It would open in a separate window. It might be useful also to click on the input image and see respective patterns. Eventually - to make selections and run something on them, make summaries, collect items aside and then compare them etc. ...

We can automatically parse "some" trees, even uninformed, because the nested structures are trees - based on the parentheses and brackets - but that doesn't suggest the types and the labels by default, or whether a list or a tuple is a leaf (same level) or a branch (new level). Also, it may require traversing the structure after first printing it somewhere (like print or pprint).

The types within the blob could be "guessed" based on knowledge about the algorithm, the length of the tuples, which are lists, where are borderlines between tuples and lists etc., but that's rather unnecessary overhead since it could "know" them exactly.
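A minimal sketch of such "uninformed" unfolding: indenting by nesting level based only on list/tuple structure, with no knowledge of types or labels; the sample data is made up:

def unfold(node, depth=0):
    pad = '  ' * depth
    if isinstance(node, (list, tuple)):
        print(pad + type(node).__name__)  # 'tuple' or 'list', nothing more is known
        for item in node:
            unfold(item, depth + 1)
    else:
        print(pad + repr(node))  # a leaf: any non-container value

unfold((0, 449, [(69, 0), (70, -10)]))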

As for the types, I think I suggested named tuples as an alternative (for having identifiers), but you skipped them?

https://docs.python.org/2/library/collections.html#collections.namedtuple
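A minimal namedtuple sketch for the dP structure discussed earlier; the field names follow the comment "pri_s, I, D, Dy, V, Vy, ders2_":

from collections import namedtuple

dP = namedtuple('dP', 'pri_s I D Dy V Vy ders2_')
p = dP(0, 0, 0, 0, 0, 0, [])
print(p.pri_s, p.I)  # fields accessible by name
print(p[0], p[1])    # and still by position, like a plain tuple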

BTW, this discussion suggests a huge memory usage. About 10M of the simplest two-tuples eat up about 1G RAM for unique objects and 120MB in case of just 10M references to the same object.

https://stackoverflow.com/questions/45123238/python-class-vs-tuple-huge-memory-overhead


> I just added a simple try-except to frame_dblobs:

try:
    if _fork_[0][0][5] == 1:  # _fork roots, see ln161, never == 1?
        blob_seg = form_blob_seg(_fork_[0], _P, _x)  # _P (y-2) is
packed in _fork_[0] blob segment + __fork_ (y-3)
        ini = 0  # no blob segment initialization
        return ini, blob_seg
except:
    break

and put a breakpoint before "break", that preserved variable values. That's ok for now, I just didn't know about try-except :).

Good that you're learning. You mean you're now watching it online in the IDE?

Usually it's desirable for the exception to get some processing (exception handling) - at least a print or logging - and not just skipping the error; but if you already know what it's returning and it's a quick temporary solution, that's OK.
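A minimal sketch of the kind of handling meant here - log the traceback instead of silently skipping; logging.exception records the current exception info:

import logging

try:
    [][0]  # stand-in for the failing access
except IndexError:
    logging.exception("scan_P failed here")  # logs message plus traceback, then continues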

I don't know, it seems that classes are mostly useful for linking disparate programs, and cogalg is self-contained. Anyway, there are better things to focus on for now.

It's self-contained, but the individual operations, data and functions are discontinuous as well and the practical programs are more than the core algorithm. Especially during development.

OOP allows operator overloading. I guess you wouldn't want that for now, it may be confusing, but again - in the long term, as understanding and generalization grow, it may get more convenient and allow more concise notation within the code.

OOP could be useful for exception handling, e.g. to pack info about the exceptions. OOP also keeps code and data closer together, which is convenient; it allows generalization and reduction of code duplication, control of the way data is accessed (encapsulation), easy extension at a single site, etc.

There would be an overhead, though, in the already memory-hungry Python.

But once the algorithm is clear and debugged well enough, it would be ported anyway.

Twenkid commented 5 years ago

(Oh, I thought it didn't send - so this is a shorter version of the above comment... :) )

Actually, I meant manual unfolding, as in pycharm variables. My patterns / blobs have hierarchical structure; I want to be able to click on higher levels to see lower-level details selectively, or just skip them.

I know that you meant that, as well, the GUI element is called TreeView. It could be done "locally" in Python as well with a GUI library or with a custom simple control, drawing with OpenCV for example.

However, you need to know the structure of the tree - which element is a leaf and which is a branch, and where each ends; without types it would be ambiguous. Some tree could be parsed and built based on the ( ) and [ ], but the types should be inferred somehow - again, it's possible with some knowledge of the lengths and structure of the used tuples, but it's explicit in their initializations and assignments, which have to be mapped.

I just added a simple try-except to frame_dblobs:

try:
    if _fork_[0][0][5] == 1:  # _fork roots, see ln161, never == 1?
        blob_seg = form_blob_seg(_fork_[0], _P, _x)  # _P (y-2) is packed in _fork_[0] blob segment + __fork_ (y-3)
        ini = 0  # no blob segment initialization
        return ini, blob_seg
except:
    break

and put a breakpoint before "break", that preserved variable values. That's ok for now, I just didn't know about try-except :).

Good!

I don't know, it seems that classes are mostly useful for linking disparate programs, and cogalg is self-contained. Anyway, there are better things to focus on for now.

OOP has many other usages, e.g. operator overloading, generalization of code and reduction of code duplication, extension of classes (in Python - in runtime as well) and making the code more compact or/and readable on the site of accessing the functions, encoded in the object-oriented part.

OOP also divides the complexity between subsystems and is for making explicit hierarchical systems.

...

BTW, if I'm not mistaken, I suggested named tuples in the past, but you dismissed them? They lie in between classes and tuples: https://docs.python.org/2/library/collections.html#collections.namedtuple

boris-kz commented 5 years ago

OOP has many other usages, e.g. operator overloading, generalization of code and reduction of code duplication, extension of classes (in Python - in runtime as well) and making the code more compact on the site of accessing the functions, encoded in the object-oriented part.

This is all case-specific, understanding the code comes first.

BTW, if I'm not mistaken, I've suggested the named tuples in the past, but you've dismissed them?

Yes, it seemed unnecessary then. I may add them to blobs if they get too complex to recognize vars by position. Positional tracking is easier for me, but you can name them for yourself. So, back to the main bug: the blobs don't accumulate because if _fork_[0][0][5] == 1: is never true. I need to refresh on how root_ is built and used; I may have made mistakes there.

Twenkid commented 5 years ago

So my next task will be to apply a hierarchical/Tree view GUI for the output, for a start without labels. I'm a bit tired lately, so if I don't manage to complete it to a milestone tomorrow, I may have a little break for a few days and will show something probably next week.

boris-kz commented 5 years ago

...

( You once mentioned about tracking the object IDs in Pycharm, I think that's their address, it's what is printed by default if a class object is fed to print(): <main.dP_Class object at 0x7fa247bddc50>

However inside the constructed tuples there would be just the id of the compound object. )

Interesting... That id loss might be why my blobs don't accumulate.

In ln145, fork_.append([]), that empty list is only to reserve an object id for _P, which is replaced by a blob on the next scan line.

Future blob will replace _P by pop + appended (ln175, ln176), vs. by assignment, because that's supposed to preserve object id.

But that would only happen on the next line, after P_.append((P, x, fork_)) on ln179.

So, fork_ is packed into a tuple, and you are saying that destroys all internal IDs?

And that can be prevented by naming the elements?

So my next task will be to apply a hierarchical/Tree view GUI for the output, for a start without labels. I'm a bit tired lately, so if I don't manage to complete it to a milestone tomorrow, I may have a little break for a few days and will show something probably next week.

That seems redundant for now, online pycharm does the job.

Could you work on converting to named tuples instead, if that preserves object ID?

That would also help you to understand the code, and see which technical implementations are helpful?

Thanks!


Twenkid commented 5 years ago

( You once mentioned about tracking the object IDs in Pycharm, I think that's their address, it's what is printed by default if a class object is fed to print(): <main.dP_Class object at 0x7fa247bddc50>

However inside the constructed tuples there would be just the id of the compound object. )

Interesting... That id loss might be why my blobs don't accumulate.

I assume you refer to id(variable)?

Another id is the memory address in Pycharm: <memory at 0x0000022BD96C7648>


If the tuples are added without being divided into parts and then recollected, they may preserve their ids, but if you read the values and then recollect them - a, b, c = _P_.popleft(); _P_.append((a, b, c)) - it's supposed to be another object.

It's visible in this simple test:

>>> a = 1, 2, 3
>>> d = []
>>> d.append(a)
>>> d
[(1, 2, 3)]
>>> id(d)
1737922795976
>>> id(a)
1737910335096
>>> id(d[0])
1737910335096  # the tuple preserves its ID

>>> x1, y1, z1 = d.pop()
>>> d.append((x1, y1, z1))  # the tuple is recreated
>>> id(d[0])
1737922805264  # it's now a new object

...

So my next task will be to apply a hierarchical/Tree view GUI for the output, for a start without labels. I'm a bit tired lately, so if I don't manage to complete it to a milestone tomorrow, I may have a little break for a few days and will show something probably next week.

That seems redundant for now, online pycharm does the job.

Could you work on converting to named tuples instead, if that preserves object ID? That would also help you to understand the code, and see which technical implementations are helpful?

OK, if there's a reallocation I don't know whether it would help - it'd be a new object. But it could help if I do as with the tuple-extension suggestion: by adding an additional variable to each tuple, called "id". When an object is recreated, it would copy that stable id, while Python's internal one would change.

The assignment would be again something like GetID() in order to get it from a generator - the simplest is a counter, and the number would be mapped at least to a type (in case of named tuple that won't be necessary).

...

As for the Tree View - I've used that control, and ironically I once drew an icon similar to Pycharm's...


boris-kz commented 5 years ago

Interesting... That id loss might be why my blobs don't accumulate.

I assume you refer to id(variable)?

Yes.

If the tuples are added without being divided into parts and then recollected, they may preserve their ids, but if you read the values and then recollect them, it's supposed to be another object.

I meant id of tuple elements. Actually, using lists instead of tuples should preserve them. I just did that:

ln149: buff_.append([_P, _x, fork_, root_])
ln179: P_.append([P, x, fork_])  # P with no overlap to next _P is buffered for next-line scan_P, via y_comp

But if _fork_[0][0][5] == 1: is still always false

...

OK, if there's a reallocation, I don't know would it help, it'd be a new object. But it could help if I do as with the tuple extension suggestion - by adding an additional variable to each tuple, called "id". When an object is recreated, it would copy that stable id, while the Python's internal one would change.

The assignment would be again something like GetID() in order to get it from a generator - the simplest is a counter, and the number would be mapped at least to a type (in case of named tuple that won't be necessary).

Good, that could help tracking.

Twenkid commented 5 years ago

I meant id of tuple elements. Actually, using lists instead of tuples should preserve them. I just did that: ...

Of the elements - then that may add requirement for an id per element, not just per tuple, thus making even the single elements tuples.

But if _fork_[0][0][5] == 1: is still always false

OK. Is the error still that _fork_ is empty? If so, then none of these lists and items exist - not _fork_[0], nor, from there on, the sub-patterns or their parts - so these lists are never created and filled with data.

Something in blob, blob_seg, termblob, ... family of functions, because I see fork is loaded from there?

I don't see why the error should have appeared due to different ids - there are no checks of the id in the code? You assign and compare values?

Why do you think it could be because of the ids - something that you're watching in the pycharm debugger?

In ln145, fork_.append([]), that empty list is only to reserve an object id

OK, it does for the container.

Twenkid commented 5 years ago

Boris, I invented something else which seemed easier to do for a start, before extending the tuples - a Read-eval-print loop that is entered after an exception, with execution applied over a selected saved state.

We can add and execute code of any complexity by exec().

From simple printing of selected variable/s to modifying them, to functions.

Eventually going back to the function that failed and continuing with those changed values.

https://github.com/Twenkid/CogAlg/commit/e9d7b137f135b36bbf1a3263e3d6a6ac0d1dbccd
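For reference, a minimal self-contained sketch of the idea - hypothetical names (debug_repl, risky), not the committed code; the actual session follows below:

import traceback

def debug_repl(saved_state):
    """REPL over a snapshot of local variables, entered after an exception."""
    print("localVars:", *saved_state, sep="\n")
    while True:
        try:
            line = input("Enter python line (Ctrl-C for exit)... ")
        except (KeyboardInterrupt, EOFError):
            break
        try:
            exec(line, globals(), saved_state)  # executes directly over the saved state
        except Exception:
            traceback.print_exc()

def risky(x, _ix):
    saved = dict(locals())  # snapshot before the risky section
    try:
        return [][0]  # provoke an IndexError
    except IndexError:
        debug_repl(saved)

risky(822, 810)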


localVars
frame
_P_
_buff_
P_
P
x
_ix
ix
fork_
buff_
root_
_fork_
_x
_P
Enter python line (Ctrl-C for exit)... print(x, _ix)
822 810
====
Directly over the saved states
822 810
Enter python line (Ctrl-C for exit)... if (x>_ix): print("x is bigger than _ix", x, _ix)
x is bigger than _ix 822 810
====
Directly over the saved states
x is bigger than _ix 822 810
Enter python line (Ctrl-C for exit)... x = 823
====
Directly over the saved states
Enter python line (Ctrl-C for exit)... print(x)
823
====
Directly over the saved states
823
Enter python line (Ctrl-C for exit)... print(buff_)
deque([])
====
Directly over the saved states
deque([])
Enter python line (Ctrl-C for exit)... print(root_)
[[], [], []]
====
Directly over the saved states
[[], [], []]
Enter python line (Ctrl-C for exit)... print(_P)
(0, 449, -131, -106, -2078, -1934, [(69, 0, -242, 5, -240), (70, -10, -261, 0, -238), (70, -23, -290, -10, -250), (68, -35, -316, -23, -274), (64, -37, -327, -33, -301), (57, -24, -318, -30, -317), (51, -2, -324, -15, -314)])
boris-kz commented 5 years ago

Of the elements - then that may add requirement for an id per element, not just per tuple, thus making even the single elements tuples.

Every object already has an id, and everything but integers is an object, including lists fork_ and root_.


OK. Is the error still that _fork_ is empty?

No, but len(root_) and the corresponding _fork_[0][0][5] are always > 1. I have an idea why: I think I messed up the cross-reference between fork_ and root_ in ln145 and ln146. Still working it out.

I don't see why the error would appear due to different ids: there are no checks of id in the code, right? You just assign and compare values?

Why do you think it could be because of the ids? Is it something you're watching in the PyCharm debugger?

ID defines a unique object, in this case fork_ and root_, which connect patterns between lines: lower-line Ps to higher-line _Ps.

That reference is by object id rather than by variable name, because an id is unique while the same variable names apply to many patterns.

So, the only way a _P can be included in the right blob is by the id of that blob. If the id changes, blobs won't accumulate: their _Ps won't find them.
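A minimal illustration of that id-based linkage (blob, fork_ and root_ are used loosely here, not as the repo's actual structures): a shared mutable list keeps its id while being accumulated in place, whereas a tuple must be rebuilt on every update, which produces a new id.

blob = [0]                   # shared accumulator; its identity is id(blob)
fork_ = [blob]               # a lower-line P references the blob object
root_ = [blob]               # a higher-line _P references the same object

blob[0] += 5                 # accumulate in place: id(blob) is unchanged
assert fork_[0] is root_[0]  # both still find the same blob

t = (0,)                     # a tuple version loses the identity:
t2 = (t[0] + 5,)             # the "updated" tuple is a brand-new object
assert t2 is not t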

Boris, I invented something else which seemed easier to do for a start than converting all the tuples: a read-eval-print loop that is entered after an exception, with execution applied over a selected saved state.

We can add and execute code of any complexity via exec().

From simply printing selected variables, to modifying them, to defining functions.

Eventually we can go back to the function that failed and continue with the changed values.

I don't know, Todor, this seems unnecessary at the moment.

The tuples can be converted to lists, and PyCharm can do all the visualization that I need.

My problem is not tinkering with parameters; functions fail because they are not designed right.

I would much rather you worked on the code; otherwise you won't know what tools are needed.

Twenkid commented 5 years ago

(Preliminary answer, don't have the code at hand right now)

Of the elements - then that may add a requirement for an id per element, not just per tuple, thus making even single elements into tuples. Every object already has an id, and everything but integers is an object, including the lists fork_ and root_.

Well, ints also have ids in Python, because everything in Python is an object of a class.

However, in your code some of the values are actually numpy arrays, even single-element ones, because they are produced from the input image, which is a numpy array.

You can see that in the PyCharm debugger: the fields are nested too deep, and I noticed a "dtype" element.

The elements of numpy arrays are packed raw values; with just "print" they look like normal lists and can be accessed like normal lists with [], but indexing creates a temporary scalar object each time, so two different indices may even report the same id.

(Tested: https://www.pythonanywhere.com/try-ipython/ )

In [2]: import numpy as np
In [3]: a = np.array([1,2,3,4,5])
In [4]: a
Out[4]: array([1, 2, 3, 4, 5])
In [5]: id(a)
Out[5]: 139810716245168
In [6]: id(a[0])
Out[6]: 139810966726768
In [7]: id(a[1])
Out[7]: 139810966726792
In [8]: id(a[2])
Out[8]: 139810966726792

print(a)
[1 2 3 4 5]

Maybe you could try converting it to a list. I think I've tried this, but it was a long time ago, and it's supposed to be a performance hit. https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.tolist.html

a = np.array([1,2])
a.tolist()
[1, 2]

>>> i = 123
>>> id(i)
93863562748832
>>> j = 235
>>> id(j)
93863562752416
>>> k = []
>>> k.append(i); k.append(j)
>>> k
[123, 235]
>>> id(k[0])
93863562748832
>>> id(k[1])
93863562752416

Lists created like [] should be just lists, but the elements inside could be numpy arrays. I'll check the types more thoroughly later, or you could see them yourself in the debugger.
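A quick, hypothetical way to check what actually flows through the tuples; the describe helper is just an example, though the isinstance checks against np.ndarray and np.generic are standard numpy.

import numpy as np

def describe(value, name="value"):
    if isinstance(value, np.ndarray):
        print(name, "is an ndarray, dtype", value.dtype)
    elif isinstance(value, np.generic):          # numpy scalar, e.g. np.int64
        print(name, "is a numpy scalar:", type(value).__name__)
    else:
        print(name, "is a plain Python", type(value).__name__)

p = np.array([[1, 2], [3, 4]])
describe(p, "p")                             # ndarray
describe(p[0, 0], "p[0,0]")                  # numpy scalar, not a Python int
describe(p[0, 0].item(), "p[0,0].item()")    # .item() yields a Python int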

I meant id of tuple elements. Actually, using lists instead of tuples should preserve them. I just did that: ln149: buff_.append([_P, _x, fork, root]) ln179: P_.append([P, x, fork_]) # P with no overlap to next _P is buffered for next-line scan_P, via y_comp.

But if fork[0][0][5] == 1: is still always false. OK. Is the error still that fork_ is empty? No, but len(root_) and the corresponding fork[0][0][5] are always > 1. I have an idea why: I think I messed up the cross-reference between fork_ and root_, in ln145 and ln146. Still working it out.

Why do you think it could be because of the ids? Is it something you're watching in the PyCharm debugger? ID defines a unique object, in this case fork_ and root_, which connect patterns between lines: lower-line Ps to higher-line _Ps. That reference is by object id rather than by variable name, because an id is unique while the same variable names apply to many patterns. So, the only way a _P can be included in the right blob is by the id of that blob. If the id changes, blobs won't accumulate: their _Ps won't find them.

a Read-eval-print loop ... I don't know, Todor, this seems unnecessary at the moment. The tuples can be converted to lists, and PyCharm can do all the visualization that I need. My problem is not tinkering with parameters; functions fail because they are not designed right. I would much rather you worked on the code; otherwise you won't know what tools are needed.

OK, I see, but it may be useful to me for studying the code.

Twenkid commented 5 years ago

Note that numpy arrays are printed without commas, while plain lists are printed with them.


In [22]: a = np.array([1,2,3,4,5])
In [23]: c = a[2]
In [24]: print(id(c))
139810966726816
In [25]: b = a.tolist()
In [26]: print(b)
[1, 2, 3, 4, 5]
In [27]: id(a[0])
Out[27]: 139810966726792
In [28]: id(a[1])
Out[28]: 139810966726792
In [29]: id(a[2])
Out[29]: 139810966726792
In [30]: id(a[3])
Out[30]: 139810966726792
In [31]: id(b[0])
Out[31]: 38818136
In [32]: id(b[1])
Out[32]: 38818112
In [33]: id(b[2])
Out[33]: 38818088
In [34]: id(b[3])
Out[34]: 38818064
boris-kz commented 5 years ago

On Sat, Sep 15, 2018 at 5:18 AM Todor Arnaudov notifications@github.com wrote:

(Preliminary answer, don't have the code at hand right now)

Of the elements - then that may add a requirement for an id per element, not just per tuple, thus making even single elements into tuples. Every object already has an id, and everything but integers is an object, including the lists fork_ and root_.

Well, ints also have ids in Python, because everything in Python is an object of a class.

Ok. I remember reading that integers are not objects in Python, but that seems wrong.

However, in your code some of the values are actually numpy arrays, even single-element ones, because they are produced from the input image, which is a numpy array.

You can see that in the PyCharm debugger: the fields are nested too deep, and I noticed a "dtype" element.

The elements of numpy arrays are packed raw values; with just "print" they look like normal lists and can be accessed like normal lists with [], but indexing creates a temporary scalar object each time, so two different indices may even report the same id.

(Tested: https://www.pythonanywhere.com/try-ipython/ ) In [2]: import numpy as np In [3]: a = np.array([1,2,3,4,5]) In [4]: a Out[4]: array([1, 2, 3, 4, 5]) In [5]: id(a) Out[5]: 139810716245168 In [6]: id(a[0]) Out[6]: 139810966726768 In [7]: id(a[1]) Out[7]: 139810966726792 In [8]: id(a[2]) Out[8]: 139810966726792 print(a) [1 2 3 4 5]

Maybe you could try converting it to a list. I think I've tried this, but it was a long time ago, and it's supposed to be a performance hit. https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.tolist.html

a = np.array([1,2])
a.tolist()
[1, 2]

>>> i = 123
>>> id(i)
93863562748832
>>> j = 235
>>> id(j)
93863562752416
>>> k = []
>>> k.append(i); k.append(j)
>>> k
[123, 235]
>>> id(k[0])
93863562748832
>>> id(k[1])
93863562752416

Lists created like [] should be just lists, but the elements inside could be numpy arrays.

What difference does it make? I only care about the ids of new lists and tuples.

I meant id of tuple elements. Actually, using lists instead of tuples should preserve them. I just did that: ln149: buff_.append([_P, _x, fork, root]) ln179: P_.append([P, x, fork_]) # P with no overlap to next _P is buffered for next-line scan_P, via y_comp.

But if fork[0][0][5] == 1: is still always false. OK. Is the error still that fork_ is empty? No, but len(root_) and the corresponding fork[0][0][5] are always > 1. I have an idea why: I think I messed up the cross-reference between fork_ and root_, in ln145 and ln146. Still working it out.

Why do you think it could be because of the ids? Is it something you're watching in the PyCharm debugger? ID defines a unique object, in this case fork_ and root_, which connect patterns between lines: lower-line Ps to higher-line _Ps. That reference is by object id rather than by variable name, because an id is unique while the same variable names apply to many patterns. So, the only way a _P can be included in the right blob is by the id of that blob. If the id changes, blobs won't accumulate: their _Ps won't find them.

a Read-eval-print loop ... I don't know, Todor, this seems unnecessary at the moment. The tuples can be converted to lists, and PyCharm can do all the visualization that I need. My problem is not tinkering with parameters; functions fail because they are not designed right. I would much rather you worked on the code; otherwise you won't know what tools are needed.

OK, I see, but it may be useful to me for studying the code.

Or you can use PyCharm. No need to reinvent the wheel.

Thanks for your help, Todor!

I just sent you $1000.00 by PayPal, for the 1st half of September.

Twenkid commented 5 years ago

Ok. I remember reading that integers are not objects in Python, but that seems wrong.

I think that was true for Java and C#; besides classes, they also had "primitive types" for performance reasons.

What difference does it make? I only care about the ids of new lists and tuples.

Didn't you ask about their elements?

I meant id of tuple elements. Actually, using lists instead of tuples should preserve them. I just did that: ln149: buff_.append([_P, _x, fork, root]) (...)

OK, I see, but it may be useful to me for studying the code. Or you can use PyCharm. No need to reinvent the wheel.

You'd be right if the goal were just to manually watch the variables in slow step-by-step runs, or if one tried to copy the entire PyCharm with all its menus etc. That would be too much and unnecessary; I wouldn't do it.

I want to automate operations, create and generate more interesting ones, and make existing ones, such as manual debugging, faster, less tedious and easier.

The last feature I demonstrated, for example, could be extended to something else, like temporary code injection and eventually runtime-adaptable code generation, with the respective meta-structures; see the sketch below.

The demands would arise on their own as understanding of the code improves, and such little ideas will combine.
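A toy illustration of that temporary-code-injection idea; entirely hypothetical, not from the repo, with scan_P reused only as an example name.

src = '''
def scan_P(x, _ix):
    print("patched scan_P:", x, _ix)
    return x > _ix
'''
namespace = {}
exec(src, namespace)              # compile and define the patched function
scan_P = namespace["scan_P"]      # swap the injected version in

print(scan_P(822, 810))           # prints "patched scan_P: 822 810", then True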

Thanks for your help, Todor! I just sent you $1000.00 by PayPal, for the 1st half of September.

Thanks, a good start for a new development and prize session!

boris-kz commented 5 years ago

Ok. I remember reading that integers are not objects in Python, but that seems wrong.

I think that was true for Java and C#; besides classes, they also had "primitive types" for performance reasons.

Yes, I probably remember it from C#.

What difference does it make? I only care about the ids of new lists and tuples.

Didn't you ask about their elements?

I meant id of tuple elements. Actually, using lists instead of tuples should preserve them. I just did that: ln149: buff_.append([_P, _x, fork, root]) (...)

Yes, the elements that are lists and tuples.

OK, I see, but it may be useful to me for studying the code. Or you can use PyCharm. No need to reinvent the wheel.

You'd be right if the goal were just to manually watch the variables in slow step-by-step runs, or if one tried to copy the entire PyCharm with all its menus etc. That would be too much and unnecessary; I wouldn't do it.

I want to automate operations, create and generate more interesting ones, and make existing ones, such as manual debugging, faster, less tedious and easier.

The last feature I demonstrated, for example, could be extended to something else, like temporary code injection and eventually runtime-adaptable code generation, with the respective meta-structures.

The demands would arise on their own as understanding of the code improves, and such little ideas will combine.

I think this is just another excuse for you to avoid working on the code. What you really need at this point is imagination and communication.

Thanks, a good start for a new development and prize session!

Let's hope this one lasts :).

Twenkid commented 5 years ago

Yes, the elements that are lists and tuples.

OK.

What you really need at this point is imagination and communication.

The latter also depends on your behavior, you know. Lately you've been behaving well, thanks.

Thanks, a good start for a new development and prize session! Let's hope this one lasts :).

Yes. :)

I think this is just another excuse for you to avoid working on the code.

Well, if it means avoiding working in the same way and with the same limitations you impose on yourself, "manually" etc., then maybe yes. I know that before moving forward I'll have to fit into the code, but the tools may help me decode it, like taking interactive notes. Some tools may make it easier for me and for other developers.

Also, I think you've got the wrong expectation that all of these ideas are huge projects with 99999 lines of code and an extensive design process. In fact, some of them could be developed in an hour, a few hours, or days, depending on motivation and unexpected obstacles, and then, when combined, provide new insights.

I'd better skip the details for now and show them when the time comes, as with the logs which you dismissed as wrong, but which happened to be more helpful than the online debugger alone.

boris-kz commented 5 years ago

What you really need at this point is imagination and communication.

The latter also depends on your behavior, you know.

Which depends on you staying on topic. Ok, I tend to be too abrasive, but... no one is perfect.

Lately you've been behaving well, thanks.

Because you contribute :).

I'd better skip the details for now and show them when the time comes, as with the logs which you dismissed as wrong, but which happened to be more helpful than the online debugger alone.

Not really; the online debugger is still more natural, it just needed try-except-break. And it can track excessive len(root_) with id(), but I am revising the whole scheme right now.
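A hypothetical sketch of that kind of id-based tracking; the helper name, threshold, and print format are illustrative, not repo code.

watched_ids = set()

def check_root(root_, max_len=1):
    # Call wherever root_ is extended; flags a suspiciously long root_ once.
    if len(root_) > max_len and id(root_) not in watched_ids:
        watched_ids.add(id(root_))
        print("excessive len(root_):", len(root_), "id:", id(root_))
        # import pdb; pdb.set_trace()   # optionally break into the debugger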

Twenkid commented 5 years ago

On Mon, Sep 17, 2018, at 4:24, Boris Kazachenko notifications@github.com wrote:

I'd better skip the details for now and show them when the time comes, as with the logs which you dismissed as wrong, but which happened to be more helpful than the online debugger alone.

Not really; the online debugger is still more natural, it just needed try-except-break. And it can track excessive len(root_) with id(), but I am revising the whole scheme right now.

What is excessive len(root_): a root_ with excessive length? If you track it manually, can you point to where exactly the id changes abruptly when it should have stayed the same?