Closed ChristianEschen closed 7 years ago
The LIDC dataset doesn't assign unique global identifiers to the physical nodules. For a given physical nodule, there may exist up to 4 annotations that refer to it. The annotations are anonymous, so even when 4 annotations are known to refer to the same nodule, there is no way to tell which radiologist drew which one, nor to follow a single annotator consistently from one nodule to the next.
We can estimate when annotations refer to the same physical nodule in a scan by examining the properties of the annotations and clustering them based on those properties. pylidc provides a number of distance metrics between annotations based on the annotation contour coordinates. The Scan model provides a cluster_annotations function, which clusters annotations by determining the connected components of the adjacency graph associated with a chosen distance metric and distance tolerance.
Here's an example:
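import pylidc as pl

# Take the first scan in the database and group its annotations by nodule.
scan = pl.query(pl.Scan).first()
nods = scan.cluster_annotations()

print("Scan is estimated to have", len(nods), "nodules.")
for i, nod in enumerate(nods):
    print("Nodule", i + 1, "has", len(nod), "annotations.")
    for j, ann in enumerate(nod):
        # Note: newer pylidc releases may expose centroid as a property (ann.centroid).
        print("-- Annotation", j + 1, "centroid:", ann.centroid())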
Output:
Scan is estimated to have 4 nodules.
Nodule 1 has 4 annotations.
-- Annotation 1 centroid: [ 331.90680101 312.30982368 1480.44962217]
-- Annotation 2 centroid: [ 328.60546875 309.91796875 1479.73046875]
-- Annotation 3 centroid: [ 327.91666667 309.88293651 1479.01785714]
-- Annotation 4 centroid: [ 332.55660377 313.88050314 1479.94339623]
Nodule 2 has 4 annotations.
-- Annotation 1 centroid: [ 360.81122449 169.19642857 1542.10459184]
-- Annotation 2 centroid: [ 360.82233503 169.21319797 1542.14720812]
-- Annotation 3 centroid: [ 361.05243446 168.86142322 1542.34269663]
-- Annotation 4 centroid: [ 361.25501433 171. 1542.80659026]
Nodule 3 has 1 annotations.
-- Annotation 1 centroid: [ 336.41666667 348.83333333 1545.75 ]
Nodule 4 has 4 annotations.
-- Annotation 1 centroid: [ 340.54020979 245.07692308 1606.14160839]
-- Annotation 2 centroid: [ 341.29061103 244.65275708 1605.90834575]
-- Annotation 3 centroid: [ 341.75417299 244.03490137 1606.95827011]
-- Annotation 4 centroid: [ 341.53110048 245.58532695 1606.5 ]
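To make the "connected components of the adjacency graph" idea concrete, here is a minimal standalone sketch that does not use pylidc. It assumes plain Euclidean distance between centroids as the metric, whereas pylidc's own metrics work on the contour coordinates. The function name cluster_by_distance and the tolerance of 10.0 are made up for illustration. The sample points are a few of the centroids printed above, rounded to two decimals.

```python
import math


def cluster_by_distance(points, tol):
    """Group point indices into connected components of the graph
    that links two points whenever their distance is <= tol."""
    n = len(points)
    parent = list(range(n))  # union-find forest, one tree per cluster

    def find(i):
        # Walk up to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Every pair within tolerance is an edge: merge the two clusters.
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= tol:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# A few centroids from the output above (rounded), used as toy input.
centroids = [
    (331.91, 312.31, 1480.45),
    (328.61, 309.92, 1479.73),
    (360.81, 169.20, 1542.10),
    (360.82, 169.21, 1542.15),
    (336.42, 348.83, 1545.75),
    (340.54, 245.08, 1606.14),
    (341.29, 244.65, 1605.91),
]

# Hypothetical tolerance, chosen only so that nearby centroids link up.
clusters = cluster_by_distance(centroids, tol=10.0)
print("Estimated number of nodules:", len(clusters))
for k, members in enumerate(clusters, start=1):
    print("Nodule", k, "has annotation indices", members)
```

Because clusters are connected components, two annotations can end up in the same group without being within tolerance of each other directly, as long as a chain of close annotations links them. This is why the choice of tolerance matters.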
Wait, there's more! You can supply the annotation clusters (the variable nods above) to the scan.visualize function, and arrows will mark where the nodules are present in the scan.
This comment should really be added to the documentation instead ...
Ok, see here.
Hello
Thanks for this nice software! In the LIDC dataset the contours are drawn by different radiologists; approximately 4 have independently annotated the nodules. In order to construct more reliable segmentations/contours, it is necessary to determine which annotator annotated which nodule. Is it possible to retrieve the information about which contours belong to a specific annotator (1, 2, 3, 4)?