surface recognition problem

ChiwoongLEE commented 2 years ago

Hi. When three AOI regions defined with the surface tracker are visible in the world camera, the gaze point and the surface datum can disagree. For example, when I look at surface1, the gaze point is on surface1, but the surface datum recognizes surface2.

In other words, if three defined surfaces are visible in one world camera image, is there any way to reliably determine which surface the gaze is on?

I use tag36h11 markers. surface1: ids 0, 1, 2, 3; surface2: ids 4, 5, 6, 7; surface3: ids 8, 9, 10, 11.

ChiwoongLEE commented 2 years ago

I also think this is related to the two heat map modes. I can select the mode in Pupil Player, but not in Pupil Capture.

Is there a way to select the mode in Pupil Capture?

papr commented 2 years ago

@ChiwoongLEE Hi! For every detected surface, the surface tracker plugin maps the main scene video gaze into the corresponding surface coordinate system. If both the x and y norm_pos values of a given surface-mapped gaze datum are between 0 and 1, then that gaze point is on the given surface.

Check out Player's gaze_positions_on_surface_<name>.csv export files. Each has an on_surf column indicating whether gaze is on the surface <name> or not.
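As a rough sketch, the on_surf column can be tallied with Python's csv module. The column names on_surf and confidence, and the "True"/"False" text encoding, are assumptions about the export format; the helper name count_on_surface and the sample rows are made up for illustration.

```python
import csv
import io


def count_on_surface(csv_file, min_confidence=0.6):
    """Count on-surface gaze rows in one export (hypothetical helper).

    Assumes columns named "on_surf" and "confidence", with on_surf
    written as the text "True" or "False".
    Returns (rows_on_surface, rows_with_enough_confidence).
    """
    on_surface = 0
    total = 0
    for row in csv.DictReader(csv_file):
        if float(row["confidence"]) < min_confidence:
            continue  # skip low-confidence gaze
        total += 1
        if row["on_surf"].strip().lower() == "true":
            on_surface += 1
    return on_surface, total


# Made-up sample standing in for one exported file.
sample = io.StringIO(
    "gaze_timestamp,x_norm,y_norm,on_surf,confidence\n"
    "10.00,0.40,0.55,True,0.95\n"
    "10.01,1.30,0.50,False,0.97\n"
    "10.02,0.42,0.56,True,0.30\n"
)
print(count_on_surface(sample))
```

Running this once per surface file makes it easy to compare how much gaze each surface received.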

If this did not answer your question, please provide a visualization/screenshot of the issue.

ChiwoongLEE commented 2 years ago

image url below:

https://s3.us-west-2.amazonaws.com/secure.notion-static.com/8db8d19b-c750-4f2f-986b-b214038041ba/Screenshot_from_2022-07-14_23-05-49.png?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=AKIAT73L2G45EIPT3X45%2F20220714%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Date=20220714T140635Z&X-Amz-Expires=86400&X-Amz-Signature=6632e4f1b80c74c400de78069a4f4e26d0b8ec5a64c1fdb451e5291145f08f09&X-Amz-SignedHeaders=host&response-content-disposition=filename %3D"Screenshot%2520from%25202022-07-14%252023-05-49.png"&x-id=GetObject

I checked "on_surf" in the csv file. However, in the world window the gaze point is on the middle screen, while the csv file recognizes the left surface.

I want to make sure that when the gaze point is on the center screen, the center surface is the one that is recognized.

papr commented 2 years ago

while the csv file recognizes the left surface.

Which csv file are you referring to in this case?

ChiwoongLEE commented 2 years ago

Gaze_positions_on_surface.csv.

I am currently developing a plug-in that extracts the surface datum's norm_pos and surface name in real time.

papr commented 2 years ago

In https://github.com/pupil-labs/pupil/issues/2246 you have been able to extract the norm_pos for each surface already, correct?

The only thing missing to check whether gaze is on the corresponding surface is to check its values. If

(0.0 <= norm_pos[0] <= 1.0) and (0.0 <= norm_pos[1] <= 1.0)

then the gaze point is on the surface.

Note: Since gaze points have a higher frame rate than the scene video you might encounter cases where one gaze point is on one surface, and another point on another surface, e.g. during a saccade.

I would also recommend discarding low confidence gaze points (confidence < 0.6).
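A minimal sketch that combines the two checks above for a single surface-mapped gaze datum. The keys norm_pos and confidence are the ones used elsewhere in this thread; the function name is_gaze_on_surface and the sample datums are hypothetical.

```python
def is_gaze_on_surface(gaze, min_confidence=0.6):
    """Return True if a surface-mapped gaze datum lies on its surface.

    `gaze` is assumed to be a dict with "norm_pos" (x, y) and
    "confidence" entries.
    """
    x, y = gaze["norm_pos"]
    inside = 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0
    # Discard low-confidence gaze, as recommended above.
    return inside and gaze["confidence"] >= min_confidence


# Made-up datums for illustration.
print(is_gaze_on_surface({"norm_pos": (0.3, 0.7), "confidence": 0.9}))
print(is_gaze_on_surface({"norm_pos": (1.2, 0.7), "confidence": 0.9}))
print(is_gaze_on_surface({"norm_pos": (0.3, 0.7), "confidence": 0.2}))
```

Only the first datum passes: the second lies outside the surface, and the third is below the confidence threshold.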

ChiwoongLEE commented 2 years ago

I understood what you explained above.

What I want is to get information about surface1 when I look at surface1. But when there are three surfaces, as shown in the attached screenshot, the recognized surface seems random, regardless of where I am looking.

ChiwoongLEE commented 2 years ago

It's part of the code under development.

try:
                    if 'gaze_on_surfaces' in surfaces[0]:          
                        if surfaces[0]['name']=='left':
                            if surfaces[0]['gaze_on_surfaces'][0]['confidence']>0.6 and surfaces[0]['gaze_on_surfaces'][0]['on_surf']==True:
                                surface_name='left'
                                norm_gp_x= str(1920*surfaces[0]['gaze_on_surfaces'][0]['norm_pos'][0])
                                norm_gp_y= str(1080*(1-surfaces[0]['gaze_on_surfaces'][0]['norm_pos'][1]))
                        elif surfaces[0]['name']=='mid':
                            if surfaces[0]['gaze_on_surfaces'][0]['confidence']>0.6 and surfaces[0]['gaze_on_surfaces'][0]['on_surf']==True:
                                surface_name='mid'
                                norm_gp_x= str(3840-(1920*surfaces[0]['gaze_on_surfaces'][0]['norm_pos'][0]))
                                norm_gp_y= str(1080*(1-surfaces[0]['gaze_on_surfaces'][0]['norm_pos'][1]))                                      
                        elif surfaces[0]['name']=='right':
                            if surfaces[0]['gaze_on_surfaces'][0]['confidence']>0.6 and surfaces[0]['gaze_on_surfaces'][0]['on_surf']==True:
                                surface_name='right'
                                norm_gp_x= str(5760-(1920*surfaces[0]['gaze_on_surfaces'][0]['norm_pos'][0]))
                                norm_gp_y= str(1080*(1-surfaces[0]['gaze_on_surfaces'][0]['norm_pos'][1]))

papr commented 2 years ago

Ah, there is a misunderstanding. The entries in surfaces are not ordered by whether the subject is looking at them. Instead, they are ordered in definition order. With surfaces[0] you are always looking at the first surface, and therefore your algorithm always returns the same surface name (if at all).

You might rather want something like this:

surface_subject_is_looking_at = []

for surface in surfaces:
    for gaze in surface["gaze_on_surfaces"]:
        if gaze['confidence'] > 0.6 and gaze['on_surf'] == True:
            gaze_time_and_surface_name = (gaze["timestamp"], surface["name"])
            surface_subject_is_looking_at.append(gaze_time_and_surface_name)

if surface_subject_is_looking_at:
    surface_subject_is_looking_at.sort()
    surface_name = surface_subject_is_looking_at[-1][1]
else:
    surface_name = None

What this does is check all gaze points (not just the first one) to see whether they are on their corresponding surface. If so, I store the gaze timestamp and the surface name in a tuple. When you sort a list of tuples in Python, the tuples are compared by their first element (in our case the gaze timestamp), so after sorting the newest element is in the last position of the list. With surface_subject_is_looking_at[-1][1], we access the last tuple and its second field: the surface name.

So even if the subject changes their gaze from one surface to the next within a single frame, the algorithm above will make sure to return the most up-to-date surface with gaze.
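To illustrate the saccade case, here is a self-contained sketch of the same idea wrapped in a function, using max() instead of a sort. The function name latest_gazed_surface and the sample data are made up; the dictionary keys follow the ones used in this thread.

```python
def latest_gazed_surface(surfaces, min_confidence=0.6):
    """Return the name of the surface with the newest on-surface gaze.

    Returns None if no confident gaze point is on any surface.
    """
    hits = [
        (gaze["timestamp"], surface["name"])
        for surface in surfaces
        for gaze in surface["gaze_on_surfaces"]
        if gaze["confidence"] > min_confidence and gaze["on_surf"]
    ]
    if not hits:
        return None
    # Tuples compare by their first element, so max() picks the newest gaze.
    return max(hits)[1]


# Made-up frame: gaze moves from "left" to "mid" within one scene frame.
surfaces = [
    {"name": "left", "gaze_on_surfaces": [
        {"timestamp": 1.00, "confidence": 0.9, "on_surf": True},
        {"timestamp": 1.01, "confidence": 0.9, "on_surf": False},
    ]},
    {"name": "mid", "gaze_on_surfaces": [
        {"timestamp": 1.00, "confidence": 0.9, "on_surf": False},
        {"timestamp": 1.01, "confidence": 0.9, "on_surf": True},
    ]},
]
print(latest_gazed_surface(surfaces))
```

Here the newer gaze point (timestamp 1.01) is on "mid", so that name is returned even though "left" comes first in the list.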

ChiwoongLEE commented 2 years ago

What does "c.append(gaze_time_and_surface_name)" mean? I understood the code you suggested above and modified it as follows.

c=[]
if surfaces[0]['gaze_on_surfaces'][0]['confidence']>0.6 and surfaces[0]['gaze_on_surfaces'][0]['on_surf']==True:
    gaze_time_and_surface_name=surfaces[0]['gaze_on_surfaces'][0]['timestamp'],surfaces[0]['name']  # store as a tuple
    gaze_time_and_surface_name=list(gaze_time_and_surface_name)
    c.append(gaze_time_and_surface_name)
if gaze_time_and_surface_name:                            
    c.sort()                        
    surface_name=c[-1][1]                            
else:
    surface_name=None

And as a result of checking the stored csv file, there was a problem that the first surface was mainly output like last time.

papr commented 2 years ago

Ah, thanks for catching that. c is meant to be surface_subject_is_looking_at. Fixed it in the original code.

papr commented 2 years ago

And as a result of checking the stored csv file, there was a problem that the first surface was mainly output like last time.

That is expected because you are not looping over all surfaces but, again, are only looking at the first detected surface (surfaces[0]).

ChiwoongLEE commented 2 years ago

Yes. After modifying the given code, there is still a problem: only the first surface (surface 1) is output, even though all surfaces are in the world frame. If I am looking at surface 2 or 3, the output is only None.

papr commented 2 years ago

Can you clarify which code you are using?

ChiwoongLEE commented 2 years ago

This is the result of modifying the "def recent_events" part of "surface_tracker.py" to suit my purpose.

def recent_events(self, events):
        frame = events.get("frame")
        self.current_frame = frame
        if not frame:
            return

        self._update_markers(frame)
        self._update_surface_locations(frame.index)
        self._update_surface_corners()
        #events["surfaces"] = self._create_surface_events(events, frame.timestamp)
        events["surfaces"] = self._create_surface_events(events, frame.timestamp)
##
        surfaces=events["surfaces"]        
        norm_gp_x=0.0  # set initial value
        norm_gp_y=0.0  # set initial value
        norm_pos = [0.0, 0.0]
        confidence=0
        try:
            if self.gaze_flag:      
                try:                        
                    if 'gaze_on_surfaces' in surfaces[0]:

                        gaze_time_and_surface_name=()
                        c=[]
                        if surfaces[0]['gaze_on_surfaces'][0]['confidence']>0.6 and surfaces[0]['gaze_on_surfaces'][0]['on_surf']==True:
                            gaze_time_and_surface_name=surfaces[0]['gaze_on_surfaces'][0]['timestamp'],surfaces[0]['name']  # store as a tuple
                            gaze_time_and_surface_name=list(gaze_time_and_surface_name)
                        #print(gaze_time_and_surface_name)
                        #a=sorted(gaze_time_and_surface_name)
                        #print(a)
                            c.append(gaze_time_and_surface_name)

                    if gaze_time_and_surface_name:

                        c.sort()
                        #gaze_time_and_surface_name.sort()
                        surface_name=c[-1][1]

                    else:
                        surface_name=None 

                    if surface_name=='left':

                        norm_gp_x= str(1920*surfaces[0]['gaze_on_surfaces'][0]['norm_pos'][0])
                        norm_gp_y= str(1080*(1-surfaces[0]['gaze_on_surfaces'][0]['norm_pos'][1]))
                    elif surface_name=='mid':

                        norm_gp_x= str(3840-(1920*surfaces[0]['gaze_on_surfaces'][0]['norm_pos'][0]))
                        norm_gp_y= str(1080*(1-surfaces[0]['gaze_on_surfaces'][0]['norm_pos'][1]))                                      
                    elif surface_name=='right':

                        norm_gp_x= str(5760-(1920*surfaces[0]['gaze_on_surfaces'][0]['norm_pos'][0]))
                        norm_gp_y= str(1080*(1-surfaces[0]['gaze_on_surfaces'][0]['norm_pos'][1]))  

                    with open("/home/kaai/pupil/recordings/gaze_point102.csv","a") as f:
                        while True:
                            f.write(str(c)+','+str(surface_name)+','+str(norm_gp_x)+ ',' +str(norm_gp_y) + '\n')
                        #f.write(str(norm_gp_x)+ ',' +str(norm_gp_y) + '\n')
                            if KeyboardInterrupt:
                                break
                    self.pub_socket.send("ROS_gaze:norm_pos:".encode() + ":".join(str(e) for e in str(norm_gp_x)).encode())
                    self.pub_socket.send("ROS_gaze:norm_pos:".encode() + ":".join(str(e) for e in str(norm_gp_y)).encode())

                except IndexError:
                    #surfaces[0]['name']="OUT OF SIGHT"
                    norm_gp_x1= 0.0
                    norm_gp_y1= 0.0
                    norm_gp_x=str(norm_gp_x1)
                    norm_gp_y=str(norm_gp_y1)                        
                    with open("/home/kaai/pupil/recordings/gaze_point102.csv","a") as f:
                        while True:
                            #f.write(str(surfaces[0]['name'])+','+str(norm_gp_x)+ ',' +str(norm_gp_y) + '\n')
                            f.write('OUT OF SIGHT'+','+str(norm_gp_x)+ ',' +str(norm_gp_y) + '\n')
                            #f.write(str(norm_gp_x)+ ',' +str(norm_gp_y) + '\n')
                            if KeyboardInterrupt:
                                break    
                    self.pub_socket.send("ROS_gaze:norm_pos:".encode() + ":".join(str(e) for e in norm_gp_x).encode())
                    self.pub_socket.send("ROS_gaze:norm_pos:".encode() + ":".join(str(e) for e in norm_gp_y).encode())

        except KeyboardInterrupt:
            self.pub_socket.close()

papr commented 2 years ago

Please have a closer look at the surfaces object. It is a list that contains all detected surfaces. This list can be empty (in which case you get the index error -> everything out of sight), have one, or multiple entries.

In any case, you are always looking at the first entry and ignoring all others with this code:

                    if 'gaze_on_surfaces' in surfaces[0]:

                        gaze_time_and_surface_name=()
                        c=[]
                        if surfaces[0]['gaze_on_surfaces'][0]['confidence']>0.6 and surfaces[0]['gaze_on_surfaces'][0]['on_surf']==True:
                        ....

Note the difference between a for-loop (looking at all entries, see my code example) and an if-statement (only looking at a single entry).

When you adjusted my code example, you removed the most important part: The for loop.

for surface in surfaces:
    for gaze in surface["gaze_on_surfaces"]:
        ...
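As a sketch of how the loop handles an empty list, a single surface, or several surfaces without indexing surfaces[0], the helper below also returns the position from the gaze datum that was actually matched. The name locate_gaze is hypothetical; the 1920x1080 size and the flipped y axis are taken from the code posted earlier in this thread and are assumptions about the setup.

```python
def locate_gaze(surfaces, width=1920, height=1080, min_confidence=0.6):
    """Return (surface_name, x_px, y_px) for the newest on-surface gaze.

    Returns None when no surface is detected or no confident gaze
    point lies on any surface.
    """
    best = None  # (gaze datum, surface name) of the newest hit so far
    for surface in surfaces:  # works for zero, one, or many entries
        for gaze in surface.get("gaze_on_surfaces", []):
            if gaze["confidence"] > min_confidence and gaze["on_surf"]:
                if best is None or gaze["timestamp"] > best[0]["timestamp"]:
                    best = (gaze, surface["name"])
    if best is None:
        return None
    gaze, name = best
    x, y = gaze["norm_pos"]
    # Scale to pixels on that surface; y is flipped so 0 is the top edge.
    return name, width * x, height * (1 - y)


# Made-up example data.
print(locate_gaze([]))
print(locate_gaze([
    {"name": "left", "gaze_on_surfaces": []},
    {"name": "mid", "gaze_on_surfaces": [
        {"timestamp": 5.0, "confidence": 0.9, "on_surf": True,
         "norm_pos": (0.5, 0.25)},
    ]},
]))
```

Because nothing is indexed directly, an empty surfaces list simply yields None instead of raising an IndexError.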

ChiwoongLEE commented 2 years ago

Thanks for your advice.

pupil-labs / pupil

surface recognition problem #2247