Closed — dkrako closed this 6 months ago
This refers to the slides of my pitch presentation last week.
Regarding the integration of text stimuli, I would propose the following design:
We can expect to encounter different kinds of stimuli during further development. Apart from `TextStimulus`, this could also be `ImageStimulus`, `ShapeStimulus`, `VideoStimulus`, `GameStimulus`, and so on. Although all these types may have completely different data structures, it makes sense to keep them in mind during design. We will focus on `TextStimulus` first.
When designing the integration of text stimuli into pymovements, we need to focus on which types of results require consideration of stimulus data.
For all types of stimuli we will probably have some kind of AOI representation. The most prominent use case is then to compute measures (e.g. first fixation duration) with respect to each AOI or groups of AOIs.
We will focus on computing measures for rectangular AOIs first.
Create a new `stimulus` namespace and have subnamespaces for each type of stimulus. This would look like this:

```python
pymovements.stimulus.text   # has TextStimulus
pymovements.stimulus.image  # has ImageStimulus
pymovements.stimulus.video  # has VideoStimulus
```
The `TextStimulus` class is composed of three attributes:

- `text`: the presented text represented as a single string
- `aois`: a mapping from one or multiple characters in the text string to XYWH
- `image`: optional image of the rendered and presented text

Loading stimulus data would use an interface like this:

```python
stimulus = pymovements.stimulus.text.from_files(
    text='path/to/text/file.txt',
    aoi='path/to/aoi.csv',
    image='optional/path/to/image.jpg',
)
```
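To make the intended structure concrete, here is a minimal sketch of such a class. This is not actual pymovements code; the class name and attribute names follow the proposal above, while the `list`-of-dicts AOI representation is a stand-in for the proposed polars dataframe, and the example is constructed directly instead of via `from_files` for brevity:

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class TextStimulus:
    """Hypothetical sketch of the proposed text stimulus container."""
    text: str                                                  # presented text as a single string
    aois: list[dict[str, Any]] = field(default_factory=list)  # stand-in for a polars frame
    image: Optional[str] = None                                # optional path to rendered screenshot


stimulus = TextStimulus(
    text='The quick brown fox',
    aois=[{'index': 0, 'string': 'T',
           'pixel_x': 10, 'pixel_y': 20, 'width': 12, 'height': 18}],
)
```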
The mapping can be a polars dataframe with the columns:

- `index`: the index of the character in the text string
- `string`: the substring captured in the AOI
- `pixel_x`: pixel x-position of the top left corner
- `pixel_y`: pixel y-position of the top left corner
- `width`: pixel width of the AOI box
- `height`: pixel height of the AOI box

The index must always be ordered.
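Since the index must always be ordered, a loader could validate this up front. A small sketch of such a check, using plain Python row dicts instead of a polars frame (the helper name is made up here):

```python
def aoi_index_is_sorted(aoi_rows):
    """Check that character indices of an AOI mapping are strictly increasing."""
    indices = [row['index'] for row in aoi_rows]
    return all(a < b for a, b in zip(indices, indices[1:]))


rows = [
    {'index': 0, 'string': 'T', 'pixel_x': 10, 'pixel_y': 20, 'width': 12, 'height': 18},
    {'index': 1, 'string': 'h', 'pixel_x': 22, 'pixel_y': 20, 'width': 12, 'height': 18},
]
```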
It can potentially have additional columns:

- `page`: the page index
- `line`: the line on the page
- `column`: the character index in the line (the naming can be improved here)
- `word`: the word this character/substring belongs to

Each `GazeDataFrame` can be assigned a stimulus during initialization like this:
```python
gaze = pymovements.GazeDataFrame(
    ...
    stimulus=stimulus,
)

gaze = pymovements.from_file(
    path='/path/to/gaze.csv',
    stimulus=stimulus,
)
```
```python
gaze.compute_aoi_measure('first fixation duration', level='word')
```
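To pin down the semantics of such a call, here is one possible way first fixation duration per AOI group could be computed. This is a sketch only, not the actual pymovements implementation: the input shapes (time-ordered fixation dicts, AOI row dicts) and the function name are assumptions for illustration:

```python
def first_fixation_duration(fixations, aoi_rows, level='word'):
    """For each AOI group, return the duration of the first fixation landing in it.

    fixations: time-ordered dicts with 'x', 'y', 'duration'.
    aoi_rows: dicts with 'pixel_x', 'pixel_y', 'width', 'height' and a grouping column.
    """
    result = {}
    for fixation in fixations:
        for aoi in aoi_rows:
            inside_x = aoi['pixel_x'] <= fixation['x'] < aoi['pixel_x'] + aoi['width']
            inside_y = aoi['pixel_y'] <= fixation['y'] < aoi['pixel_y'] + aoi['height']
            if inside_x and inside_y:
                # setdefault keeps only the first fixation on each group
                result.setdefault(aoi[level], fixation['duration'])
                break
    return result


aois = [
    {'pixel_x': 0,  'pixel_y': 0, 'width': 50, 'height': 20, 'word': 'The'},
    {'pixel_x': 50, 'pixel_y': 0, 'width': 50, 'height': 20, 'word': 'quick'},
]
fixations = [
    {'x': 10, 'y': 5, 'duration': 180},  # lands in 'The'
    {'x': 60, 'y': 5, 'duration': 220},  # lands in 'quick'
    {'x': 12, 'y': 5, 'duration': 90},   # refixation on 'The', ignored
]
```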
I'm not sure, but maybe the index in the AOI mapping could support either an integer or a tuple/slice?
I'm unsure how well multi-page text stimuli can be integrated into this design. We wouldn't want to split `GazeDataFrame`s by page, as this creates a lot of overhead. So multi-page text stimulus support is a must, I think.
One idea is to have a simple dataframe with the columns `time` and `page`.
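Assigning a page to each gaze sample from such a time/page table would then be an as-of lookup on the sample timestamp. A sketch using the standard library's `bisect` (in a polars dataframe this would likely map to a `join_asof`; the function name and tuple layout here are made up):

```python
import bisect


def page_at(sample_time, page_onsets):
    """Return the page shown at sample_time.

    page_onsets: list of (onset_time, page) tuples, sorted by onset_time.
    """
    onsets = [onset for onset, _ in page_onsets]
    # rightmost page whose onset is at or before the sample time
    position = bisect.bisect_right(onsets, sample_time) - 1
    return page_onsets[max(position, 0)][1]


page_onsets = [(0, 0), (1000, 1), (2000, 2)]
```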
The attributes `text`, `aois`, and `image` could gain a new top level with the keys being the pages. The indexing would go like this:

```python
stimulus.text[page_id], stimulus.image[page_id]
```
An alternative would be the other way around:
```python
stimulus.pages[0].text, stimulus.pages[0].image
```
I prefer the former.
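A minimal sketch of the preferred page-keyed variant (the class name and constructor are illustrative only, not a proposed API):

```python
class MultiPageTextStimulus:
    """Hypothetical container where text/aois/image are dicts keyed by page id."""

    def __init__(self, text, aois=None, image=None):
        self.text = text          # dict: page_id -> str
        self.aois = aois or {}    # dict: page_id -> AOI mapping
        self.image = image or {}  # dict: page_id -> image path


stim = MultiPageTextStimulus(
    text={0: 'First page text', 1: 'Second page text'},
    image={0: 'page_0.png', 1: 'page_1.png'},
)
```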
This is a first proposal which should open up the discussion on developing this feature. I don't have much experience working with text AOIs, so I have probably overlooked some issues. Each proposed point is open for discussion.
What was your idea here? OCR of the provided images?