boris-kz / CogAlg

This project is a Computer Vision implementation of general hierarchical pattern discovery principles introduced in README
http://www.cognitivealgorithm.info
MIT License
91 stars 41 forks source link

Remove second new_dert_ = [] in unfold? #27

Closed Twenkid closed 4 years ago

Twenkid commented 5 years ago

In unfold, isn't that line forgotten to be removed, it is zeroing the data collected in the seg... traversal in unfold, right before the sort:

new_dert_ = []

https://github.com/boris-kz/CogAlg/blob/1ddad1e979a0f0927638dff00c3a0144b40617b8/frame_2D_alg/intra_comp_debug.py#L91

Twenkid commented 5 years ago

OK, I noticed it's outdated here and in Khan's repository it's fixed by copying to a buffer before the [], but it seems new_dert__ is new_dert_ referring to the code in the function (new_dert__ is not defined there). But I understand it is changing quickly, so this is possibly fixed in a version that's not uploaded yet.

@khanh93vn Khan, would you give me rights to comment/add issues in your repository or some contacts to open discussions? Thanks!


    new_dert__.sort(key=lambda dert: dert[1])    # sorted by x coordinates
    new_dert__.sort(key=lambda dert: dert[0])    # sorted by y coordinates

        dert_buffer_ = new_dert_

        new_dert_ = []
khanh93vn commented 5 years ago

okay, my apologies for the confusion again. Results from comps are per comp per dert and are unstructured so it is sorted so that results with the same x, y would be contiguous, they'll then be merged. ncomp is the number of results with the same x, y (same pixel).

How to give you rights to comment/add issues? Ok I'll find out.

Twenkid commented 5 years ago

Thanks, I joined. I notice you've just removed the content of current unfold- content, you will work on it offline? I will study an older version later today.

As of the identifier ncomp - what about nmatch, nolp, nover(lap) etc. It doesnt matter when one operates the code and sees what it does, but IMO mnemonically "comp" is confusing, because it is used as a keyword for the family of comparison functions.

Similarly to the transition of small "p" for pixel in the first comparison, which then changes to "i" (input) in order not to be confused with the "P"-family.

On Monday, April 8, 2019, Khanh Nguyen notifications@github.com wrote:

okay, my apologies for the confusion again. Results from comps are per comp per dert and are unstructured so it is sorted so that results with the same x, y would be contiguous, they'll then be merged. ncomp is the number of results with the same x, y (same pixel).

How to give you rights to comment/add issues? Ok I'll find out.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/boris-kz/CogAlg/issues/27#issuecomment-480654395, or mute the thread https://github.com/notifications/unsubscribe-auth/AWSP2B9S6SJsbjPj6T71F5UnbrZW82Vbks5vep1LgaJpZM4cg2UW .

khanh93vn commented 5 years ago

Hi. I and Boris had a discussion earlier and we decided unfold and comp will be performed accordingly to structure of blob, segment and P. Type of branch is determined in intra_blob. intra_comp will have the same unfolding, with branch-specific operations passed as function-pointers.

Branch-specific operations compare functions, with the exception of hypot_g, which will only recompute g from dx and dy.

We also decided that the exclusive module for adjustable parameters (filter.py) be removed. So I cleaned up the repo, removing outdated parts. Well, mostly

Twenkid commented 5 years ago

What do you mean by compare functions? (I got the func.pointers)

BTW, one semantics bug regarding code I saw in: https://github.com/khanh93vn/CogAlg/blob/master/frame_2D_alg/intra_comp.py

The container that is being traversed with for...in iteration should not be altered during the traversal!

We should watch this!

(The small tests with simpler structure supported it, AFAIK) https://github.com/Twenkid/CogAlg/blob/master/forin.py

If an element is removed, the length of the structure is modified which I think causes elements to be skipped, the actual list gets shorter.

For that logic and if the changes should be done in-place (no new structure for seg), I suggest using two separate iterations: the first appends the selected elements to P, then the second removes seg from seg with the opposite condition.

unfold_blob:

        P_ = []                             # buffer for Ps at line y
        for seg in seg_:
            if y < seg[0] + seg[1][0]:      # y < y0 + Ly (y within segment):

                P_.append(seg[2][y - seg[0]])   # append P at line y of seg
            else:                           # y >= y0 + Ly (out of segment):
                seg_.remove(seg)

Maybe:

        P_ = []                             # buffer for Ps at line y
        for seg in seg_:
            if y < seg[0] + seg[1][0]:      # y < y0 + Ly (y within segment):
                P_.append(seg[2][y - seg[0]])   # append P at line y of seg

         for seg in seg_:
            if y >= seg[0] + seg[1][0]:
                seg_.remove(seg)

In a simple test:

def seg3(): #Separate traversals
    P_ = []                            
    a = (1,2,3); b=(4,5,6); c = (7,8,9); d=(5,5,5)
    seg_ = [a,b,c,d]
    print("==== seg3 SEPARATE TRAVERSALS ====")
    print("Append if > 2");
    print("Remove if <= 2");
    print("In separate traversals")
    seg_ = [a,b,c,d]
    print(seg_)

    for seg in seg_:
       x,y,z = seg   
       if x > 2:
         print("O", seg)
         P_.append(seg)

    for seg in seg_:
       x,y,z = seg   
       if x <= 2:
         print("X", seg)
         seg_.remove(seg)

    print("seg_", seg_)
    print("P_", P_)

Outputs from three tests, two using the logic with removing an element in-place while traversing (the first sample container has one more element to show a longer case). Notice the reduced number of iteration due to the removal of elements, indicated in seg1.

========================
Test: Don't remove elements from a container that is being traversed with for ... in?
========================

==== seg1 ====
Append to P[]    if < 2
Remove from seg_ if >= 2
[(1, 2, 3), (4, 5, 6), (7, 8, 9), (5, 5, 5), (9, 9, 9)]
1 [(1, 2, 3), (4, 5, 6), (7, 8, 9), (5, 5, 5), (9, 9, 9)]
O (1, 2, 3)
2 [(1, 2, 3), (4, 5, 6), (7, 8, 9), (5, 5, 5), (9, 9, 9)]
X (4, 5, 6)
3 [(1, 2, 3), (7, 8, 9), (5, 5, 5), (9, 9, 9)]
X (5, 5, 5)
seg_ [(1, 2, 3), (7, 8, 9), (9, 9, 9)]
P_ [(1, 2, 3)]
==== seg2 ====
Append to P[]    if > 2
Remove from seg_ if <= 2
[(1, 2, 3), (4, 5, 6), (7, 8, 9), (5, 5, 5)]
[(1, 2, 3), (4, 5, 6), (7, 8, 9), (5, 5, 5)]
X (1, 2, 3)
[(4, 5, 6), (7, 8, 9), (5, 5, 5)]
O (7, 8, 9)
[(4, 5, 6), (7, 8, 9), (5, 5, 5)]
O (5, 5, 5)
seg_ [(4, 5, 6), (7, 8, 9), (5, 5, 5)]
P_ [(7, 8, 9), (5, 5, 5)]
==== seg3 SEPARATE TRAVERSALS ====
Append if > 2
Remove if <= 2
In separate traversals
[(1, 2, 3), (4, 5, 6), (7, 8, 9), (5, 5, 5)]
O (4, 5, 6)
O (7, 8, 9)
O (5, 5, 5)
X (1, 2, 3)
seg_ [(4, 5, 6), (7, 8, 9), (5, 5, 5)]
P_ [(4, 5, 6), (7, 8, 9), (5, 5, 5)]
khanh93vn commented 5 years ago

The container that is being traversed with for...in iteration should not be altered during the traversal!

Ah, I got that bug before. Thank you for your help!

Last time, I replaced it with a while loop, which only +1 to index when element is not removed

But I suppose this is more elegant:

I suggest using two separate iterations: the first appends the selected elements to P, then the second removes seg from seg with the opposite condition.

Which one should we choose in this case, or is their difference is not enough to even consider?

Edit: that was very detailed, thank you for the efforts.

boris-kz commented 5 years ago

About that p -> i transition, yes, I think we should keep p as a generic input, i will be reserved for indexing. Matter of fact, iP is confusing to me, it looks like type of P. I think it should b ii, iii, etc. for nested indexing, the meaning is clear from context.

Twenkid commented 5 years ago

On Wednesday, April 10, 2019, Khanh Nguyen notifications@github.com wrote:

The container that is being traversed with for...in iteration should not be altered during the traversal!

Ah, I got that bug before. Thank you for your help!

You're welcome!

Last time, I replaced it with a while loop, which only +1 to index when element is not removed

But I suppose this is more elegant:

I suggest using two separate iterations: the first appends the selected elements to P, then the second removes seg from seg with the opposite condition.

Which one should we choose in this case, or is their difference is not enough to even consider?

Yes, as of readability I'd choose the one with two iterations. In general if we manage to separate the operations I think the code would be more readable and easier to debug. I have experience with writing code parsers in C/C++ which have similar logic (conditionally skipping different number of indices for the traversing pointers, adding to containers ...) and when writing such code one tries to compress all in one cycle.

As of doing the job "while" is more powerful than "for" and allows such adaptive adjustments of the stepping, for simple traversals and with a note and example it would be fine, but may be more confusing, especially for more complex traversals.

As of iP - I agree it looks like the Pattern family. The i, ii, iii was used a long time ago? It is suggestive about the depth of nesting, but it would be readable up to 4-5 i"s" - I guess it won't be too deep like iiiiiiiii - then it will get confusing counting the i-s....

One interesting naming convention: i0i, i1i, ... i8i, i9i, i10i etc. Or i1s, i1b , i2b, ... Where s, b may have other meaning such as seg, blob etc. Although the numbers are usually not recommended as parts of identifiers (an alternative: iai, ibi, ici, idi but it varies too much).

You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/boris-kz/CogAlg/issues/27#issuecomment-481547603, or mute the thread https://github.com/notifications/unsubscribe-auth/AWSP2E36XCLvuxfeNxvmJ3Hk-zbKirv5ks5vfYBkgaJpZM4cg2UW .

boris-kz commented 5 years ago

but it would be readable up to 4-5 i"s"

we will burn that bridge when we get to it :)

Twenkid commented 5 years ago

A coding/naming style proposal for comp_range and in general: (EDIT: I've written "intra_comp") https://github.com/khanh93vn/CogAlg/blob/master/frame_2D_alg/comp_range.py#L20

For iterations using len(...) as ending conditions, such as "len(dert_), len(_dert_)"

while i < len(dert_) and _i < len(_dert_): # while there's still comparison to be performed. Loop is per i

When these containers don't change, len(...) could be used as constants.
The function call len(...) instead of a read of a constant (within the function) also suggests that the value could have been modified in the mean time, the cycle could eventually end earlier than initially etc. and is confusing - similarly to the case with modifying a list in a for ... in loop.

Variants:

A) The obvious one:

len_dert_ , len__dert_ 

len_dert_  = len(dert_) 
len__dert_ = len(_dert_):

However I think the two-underscores are not pretty.

B) A prettier choice to me:

dert_len,  _dert_len
dert_len  = len(dert_) 
_dert_len = len(_dert_):

while i < dert_len and _i < _dert_len

This looks better to me also because the concepts/idents that are more important to be noticed first are two different (dert, _dert_) and not the repeating and generic "len(, len("), and obviously the counter i couldn't be compared to a container (list), thus it is a scalar.

       _lx = x - xd    # upper-left x coordinate
        _lx = x + xd    # upper-right x coordinate
       #a typo _rx = ...

That also suggests me (eventually) pre-processor shorthands or macros, like a Super-Python for code-patterns when there are many uniform or simple conditions (...and...) within ranges.

Simple operations/checks within a range etc. appear too complicated, so many low-level symbols.

This:

while _li < len(_dert_) and _dert_[_li] < _lx:  # search for the right coordinate
                    _li += 1
 if _li < len(_dert_) and _dert_[_li] == _lx:    # if coordinate is valid

(without the second if ... for now, because it includes future code)

in a Super-Python:

ScanRight(_dert_, _li, _lx)

or

ScanTo(_dert_, _li, _lx, +1)  #+1 is the step 

if _li < ... could be another macros

if Inside(_dert_, _li, _lx, incl):  #incl=True,  inclusive

Thus:

ScanTo(_dert_, _li, _lx, 1)
if Inside(_dert_, _li, _lx, incl):  #incl=True,  inclusive
   ...do-something

ScanTo(_dert_, _ri, _rx, 1)  
if Inside(_dert_, _ri, _rx, incl):
  ...do-something 

The preprocessor would generate the while _li < ... for the low-level view.

I know Boris doesn't like "blackboxing". :)

I don't think this is hiding anything, though, both code-views would be available, possibly in parallel windows.

What I like in such abbreviations and in function-call-style invocations is that the semantically important components are packed in a shorter view-range with less interruptions and less (or none) "context-switches" and "redirections" < ( and [..] ...

In the specific case, it suggests at a glance that both comparison sequences are doing the same job.

In general, as well, many of the algorithm's operations are iterations from a margin-to-margin over a container or/and checks is a coordinate within a given bounding range.

Even if you don't want to use such macros-like preprocessor in practice, I think some kind of a pseudocode which reflects such general iteration and "inside" patterns would be helpful.

khanh93vn commented 5 years ago

Those are good techniques, thanks for sharing!

Unfortunately, as long as Boris has to rely on the code for his mind-works, I don't have much choice.

Edit: You are a true Pythonista gangsta!

Twenkid commented 5 years ago

OK... The "const_len" thing idea is especially for code with many usages. If len is called/checked only once for the main loop, it's OK as it is as in intra_comp now.

while y < yn and i < len(blob.seg_):

If not useful for you or Boris, these generalization give insights to me to think in "bigger meaningful chunks" in the domain of the "grammar" of the algorithm, maybe to somebody else, too.

The reverse direction might be useful for the study as well: code analyzer that interprets the code, loops etc. (in the view as it is) and suggests what it does, in a more higher level verbal domain as in the example macros: ScanTo, ScanRight, ScanLeft, ScanUp, ScanDown SearchFor, Found (element), Select, Derive... Collect ... Unfold, ... so that the reader can reason about them in one concept (when possible).

That also goes for some of my ideas for displaying what the algorithm does, for studying and for debugging, especially when the code is yet incomplete and can't be run in a full environment with Pycharm. Something like the test with forin but more sophisticated.

E.g. selecting a chunk of the code, like a loop, and simulating the loops with sample data or displaying where it should start, where it should end etc. It could be smart enough to know the structures and provide appropriate data or show the nesting (important for debugging). It could show how the iteration goes etc. as diagrams, possibly animated.

Edit: You are a true Pythonista gangsta!

:D Thank you, Khan!