I saw the 'WHAT UNIT' note in the sparse noise table definition: https://github.com/int-brain-lab/IBL-pipeline/blob/fe3d00bf24b2035f6100db4296cfaebfdd08b0cb/ibl_pipeline/behavior.py#L229-L230
The answer is degrees of visual angle. In other words, the x positions are in degrees azimuth, where 0 is the centre of the visual field and negative values are to the left; the y positions are in degrees altitude, where 0 is the centre of the visual field and negative values are below the centre.
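To make the sign convention concrete, here is a minimal sketch in Python. The function name `describe_position` and its arguments are hypothetical and not part of IBL-pipeline. It also assumes that positive values mean right of and above the centre, which follows from the convention above but is not stated explicitly.

```python
def describe_position(x_deg: float, y_deg: float) -> str:
    """Describe a stimulus position given in degrees of visual angle.

    Hypothetical helper for illustration only.
    x_deg: azimuth, 0 = centre of the visual field, negative = left.
    y_deg: altitude, 0 = centre of the visual field, negative = below.
    """
    if x_deg < 0:
        horizontal = "left of centre"
    elif x_deg > 0:
        horizontal = "right of centre"  # assumed: positive azimuth is to the right
    else:
        horizontal = "at centre"

    if y_deg < 0:
        vertical = "below centre"
    elif y_deg > 0:
        vertical = "above centre"  # assumed: positive altitude is above
    else:
        vertical = "at centre"

    return (
        f"azimuth {x_deg:+.1f} deg ({horizontal}), "
        f"altitude {y_deg:+.1f} deg ({vertical})"
    )


print(describe_position(-10, 5))
```

If the convention holds, a stimulus at x = -10, y = 5 sits 10 degrees to the left of and 5 degrees above the centre of the visual field.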