## What was done?

- Implemented `xy_to_pir` locally. This also includes writing docs (please render them via `make -C docs html`) and writing doctests (you can run them via `make -C docs doctest`, however the CI does this already).
- [x] Refactored the current code and created `pir_to_xy`, which is used inside of `get_parallel_dataset`.
## How do I try it?

Below is a simple script that compares the local and the API `xy_to_pir`. Note that the main strength of the local implementation is that it can transform many coordinates at once (not shown in the script).
```python
import numpy as np

from atldld.sync import xy_to_pir
from atldld.utils import get_2d_bulk, get_3d, xy_to_pir_API_single

dataset_id = 479
image_id = 101350196  # make sure it is indeed inside of the above dataset

affine_3d, _, section_thickness = get_3d(
    dataset_id,
    ref2inp=False,  # very important, ref2inp=True is for pir_to_xy
    return_meta=True,
)
metadata_2d = get_2d_bulk(
    dataset_id,
    ref2inp=False,  # very important, ref2inp=True is for pir_to_xy
)
affine_2d, section_number = metadata_2d[image_id]

x = 1234  # feel free to modify
y = 555  # feel free to modify
coords_img = np.array(
    [
        [x],
        [y],
        [section_number * section_thickness],
    ]
)

p_API, i_API, r_API = xy_to_pir_API_single(x, y, image_id)
p_local, i_local, r_local = xy_to_pir(coords_img, affine_2d, affine_3d)[:, 0]

print(p_API, p_local)
print(i_API, i_local)
print(r_API, r_local)
```
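As noted above, the local implementation can transform many coordinates at once. A minimal sketch of how such a batch could be built, assuming (as in the single-point example) that the coordinates are stacked into a `(3, n)` array; the concrete numbers here are made up:

```python
import numpy as np

# Example values only; in practice these come from the dataset
# metadata, as in the script above.
section_number = 70
section_thickness = 100

# Ten (x, y) image points that all lie on the same section.
xs = np.arange(0, 1000, 100)
ys = np.arange(0, 500, 50)

# Stack into the (3, n) layout used by the single-point example,
# with the third row equal to section_number * section_thickness.
coords_img = np.stack(
    [xs, ys, np.full_like(xs, section_number * section_thickness)]
)
print(coords_img.shape)  # (3, 10)
```

A single call on the whole batch then replaces ten separate API round trips.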
## Want to know more?

### How did you download the responses?

First of all, they consist of both sagittal and coronal datasets. Additionally, not all datasets have the same `section_thickness`.

Python script:
```python
import json
import pathlib

import numpy as np

from atldld.utils import (
    get_2d_bulk,
    get_3d,
    xy_to_pir_API_single,
)

CONFIG = {
    "coronal": [
        100142355,
        100142290,
        1357,
        71717640,
        77371835,
    ],
    "sagittal": [
        75457491,
        75457580,
        75492803,
        79913385,
        100055064,
    ],
}

output_folder = pathlib.Path.cwd() / "tests" / "data" / "sync" / "xy_to_pir"
output_folder.mkdir(parents=True, exist_ok=True)

np.random.seed(10)
n_images_per_dataset = 2
n_points_per_image = 2

for axis, datasets in CONFIG.items():
    for dataset_id in datasets:
        metadata_2d = get_2d_bulk(
            dataset_id,
            ref2inp=False,
            add_last=False,
        )
        matrix_3d, reference_space_id, section_thickness = get_3d(
            dataset_id,
            ref2inp=False,
            add_last=False,
            return_meta=True,
        )
        # Sanity check: coronal datasets use reference space 9, sagittal ones 10.
        _refspace_check = {"coronal": 9, "sagittal": 10}
        assert _refspace_check[axis] == reference_space_id, (
            f"{axis}: unexpected reference space {reference_space_id}"
        )

        for _ in range(n_images_per_dataset):
            image_id = np.random.choice(list(metadata_2d.keys())).item()
            matrix_2d, section_number = metadata_2d[image_id]

            for _ in range(n_points_per_image):
                x = np.random.randint(0, 1000)
                y = np.random.randint(0, 1000)
                p, i, r = xy_to_pir_API_single(x, y, image_id)

                print(dataset_id, x, y, image_id, matrix_2d, matrix_3d)
                res = {
                    "dataset_id": dataset_id,
                    "image_id": image_id,
                    "section_number": section_number,
                    "section_thickness": section_thickness,
                    "axis": axis,
                    "p": p,
                    "i": i,
                    "r": r,
                    "x": x,
                    "y": y,
                    "affine_2d": matrix_2d.tolist(),
                    "affine_3d": matrix_3d.tolist(),
                }
                path = output_folder / f"{image_id}_{x}_{y}.json"
                with path.open("w") as f:
                    json.dump(res, f, indent=2)
```
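For illustration, here is a made-up fixture in the same shape as the files the script dumps, written and read back the way a test might consume it. All values below are invented, not real Allen Brain responses:

```python
import json
import pathlib
import tempfile

# Hypothetical fixture mirroring the keys dumped by the script above.
fixture = {
    "dataset_id": 479,
    "image_id": 101350196,
    "section_number": 70,
    "section_thickness": 100.0,
    "axis": "coronal",
    "p": 1.0,
    "i": 2.0,
    "r": 3.0,
    "x": 10,
    "y": 20,
    "affine_2d": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "affine_3d": [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]],
}

with tempfile.TemporaryDirectory() as tmp:
    # Same naming scheme as the script: {image_id}_{x}_{y}.json
    path = pathlib.Path(tmp) / "101350196_10_20.json"
    with path.open("w") as f:
        json.dump(fixture, f, indent=2)
    with path.open() as f:
        loaded = json.load(f)
```

The round trip through JSON preserves all of these plain ints, floats, strings, and nested lists, so a test can compare a locally computed `(p, i, r)` directly against the stored API values.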
### So how exactly is the `section_number` from the image metadata used?

The input of `xy_to_pir` consists of coordinates of the following form: `(x, y, section_number * section_thickness)`.
### What about efficiency?

I decided to give a simpler API higher priority than efficiency. Specifically, both `pir_to_xy` and `xy_to_pir` concatenate a row of ones to the input coordinates, which leads to copying of data. On the upside, users do not have to do this manually themselves. What is the slowdown? I did not really benchmark it, but it should be close to negligible.
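The row-of-ones trick mentioned above is the standard homogeneous-coordinates device: it folds the translation part of an affine map into a single matrix multiplication. A small NumPy sketch of the general idea (not atldld's actual code; the matrix below is a toy example):

```python
import numpy as np

def apply_affine(matrix, coords):
    """Apply an affine map given as a (d, d + 1) matrix to (d, n) coords.

    A row of ones is concatenated to the coordinates (this is where the
    copy happens), so the affine map becomes one matrix multiplication.
    """
    n = coords.shape[1]
    homogeneous = np.concatenate([coords, np.ones((1, n))])  # copies data
    return matrix @ homogeneous

# Toy 3D affine: scale by 2, then translate by (1, 2, 3).
matrix = np.array([
    [2.0, 0.0, 0.0, 1.0],
    [0.0, 2.0, 0.0, 2.0],
    [0.0, 0.0, 2.0, 3.0],
])
coords = np.array([[1.0], [1.0], [1.0]])  # a single point, shape (3, 1)
print(apply_affine(matrix, coords).ravel())  # [3. 4. 5.]
```

The copy made by `np.concatenate` is linear in the number of points, while the matrix multiplication dominates anyway, which is why the overhead should be close to negligible.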
Closes #74 and closes #50