I believe there may be an error in the reference implementation, in the __init__() function of the XView3Dataset class in dataloader.py, that causes a significant number of samples to be incorrectly labeled as class 1/FISHING.
The results hold for both the tiny and full datasets. If we look at the chipping annotations csv generated for the tiny validation set:
tiny_valid_chips = pd.read_csv('/home/ubuntu/xview3/process_uuid/validation/val_chip_annotations.csv')
print(tiny_valid_chips[['scene_id', 'chip_index', 'is_vessel', 'is_fishing', 'confidence', 'vessel_class']].head())
# Output:
            scene_id  chip_index is_vessel is_fishing confidence  vessel_class
0  b1844cde847a3942v         333      True        NaN     MEDIUM             1
1  b1844cde847a3942v         303      True        NaN        LOW             1
2  b1844cde847a3942v         824     False        NaN       HIGH             3
3  b1844cde847a3942v         404      True        NaN     MEDIUM             1
4  b1844cde847a3942v         404     False        NaN       HIGH             3
We can see in the first row, is_vessel is True and is_fishing is NaN. This should result in a label of 2/NONFISHING; however, the label ends up as 1/FISHING. If we do a grouping (filling in the NaN values first, since pandas drops rows with NaN in the groupby keys), we get:
groupby_cols = ['is_vessel', 'is_fishing', 'confidence', 'vessel_class']
tst = tiny_valid_chips[groupby_cols]
tst = tst.fillna('Null')
print(tst.groupby(groupby_cols).size())
# Output
is_vessel  is_fishing  confidence  vessel_class
False      Null        HIGH        3               400
                       MEDIUM      3                26
True       False       HIGH        2               272
                       MEDIUM      2                 3
           True        HIGH        1                19
           Null        LOW         1               329
                       MEDIUM      1               303
Null       Null        LOW         1                10
So we can see that cases where is_vessel is True and is_fishing is NaN are always labeled as 1/FISHING. This occurs for both LOW and MEDIUM confidence labels in the tiny set. Additionally, cases where both is_vessel and is_fishing are NaN are also labeled as 1/FISHING.
If we do the same analysis for the chipping annotations csv generated for the tiny training set, we can see that cases where is_vessel is True and is_fishing is NaN don't occur in that set. However, cases where both columns are NaN do, and they are also labeled as 1/FISHING.
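The same fillna + groupby pattern can be sketched on a small synthetic frame that mirrors the training-set cases described above (the data below is made up for illustration, not read from the real CSV):

```python
import numpy as np
import pandas as pd

# Synthetic rows mirroring the training-set pattern: no True/NaN rows,
# but a NaN/NaN row is present (and carries vessel_class 1).
train_like = pd.DataFrame({
    'is_vessel':    [True, True, False, np.nan],
    'is_fishing':   [True, False, np.nan, np.nan],
    'confidence':   ['HIGH', 'HIGH', 'HIGH', 'LOW'],
    'vessel_class': [1, 2, 3, 1],
})

groupby_cols = ['is_vessel', 'is_fishing', 'confidence', 'vessel_class']
counts = train_like[groupby_cols].fillna('Null').groupby(groupby_cols).size()
print(counts)
```

Note that fillna('Null') is only needed for display; without it, groupby would silently drop every row containing a NaN key.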
The issue seems to be in the loop in lines 271 - 277:
self.detections = pd.read_csv(detect_file, low_memory=False)
vessel_class = []
for ii, row in self.detections.iterrows():
    if row.is_vessel and row.is_fishing:
        vessel_class.append(FISHING)
    elif row.is_vessel and not row.is_fishing:
        vessel_class.append(NONFISHING)
    elif not row.is_vessel:
        vessel_class.append(NONVESSEL)
The first conditional statement,
if row.is_vessel and row.is_fishing:
    vessel_class.append(FISHING)
is meant to test whether both is_vessel and is_fishing are True. However, NaN is an ordinary (non-zero) float and is therefore truthy in Python, so it also passes this check. Here is an example data point:
tst = tiny_valid_chips[(tiny_valid_chips['is_vessel']==True) & (tiny_valid_chips['is_fishing'].isnull())]
example = tst.iloc[0]
print(f"Detect ID: {example['detect_id']}")
print(example[['is_vessel', 'is_fishing', 'vessel_class']])
if example['is_vessel']:
    print('Test 1')
if example['is_fishing']:
    print('Test 2')
if not np.isnan(example['is_fishing']):
    print('Test 3')
# Output
Detect ID: b1844cde847a3942v_006.46879283500000035190_003.47593584100000008164
is_vessel True
is_fishing NaN
vessel_class 1
Name: 0, dtype: object
Test 1
Test 2
So the first conditional statement classifies the is_vessel/is_fishing combinations True/True, NaN/NaN, and True/NaN all as class 1/FISHING.
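The truthiness behavior can be confirmed in isolation, without the dataset at all:

```python
import numpy as np
import pandas as pd

# NaN is a non-zero float, so `bool(NaN)` is True. This is why the
# reference loop's `if row.is_fishing:` branch fires for missing values.
nan = float('nan')
print(bool(nan))      # True -- NaN is truthy
print(bool(np.nan))   # True -- same for numpy's NaN
print(pd.isna(nan))   # True -- the explicit missing-value check the loop needs
```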
For a dataset with a random subset of 300 of the training scenes chipped, the label distribution looks like:
is_vessel  is_fishing  confidence  vessel_class
False      Null        HIGH        3               5418
                       MEDIUM      3               1730
True       False       HIGH        2               2868
                       MEDIUM      2                 62
           True        HIGH        1                933
                       MEDIUM      1                 28
           Null        LOW         1               3315
                       MEDIUM      1               4751
Null       Null        LOW         1                119
So (4751 + 3315 + 119) = 8185 of the 19224 detections end up labeled as class 1, seemingly incorrectly.
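One possible fix is to test for missing values explicitly with pd.isna() before branching. This is only a sketch, not the reference implementation: mapping True/NaN to NONFISHING follows the argument above, while mapping NaN/NaN to NONVESSEL is my own assumption, since the intended label for those rows is not specified.

```python
import numpy as np
import pandas as pd

# Class constants as used in the reference implementation.
FISHING, NONFISHING, NONVESSEL = 1, 2, 3

def classify(is_vessel, is_fishing):
    """Label a detection with explicit NaN handling (replaces the
    truthiness-based branches in the reference loop)."""
    if pd.isna(is_vessel) or not is_vessel:
        # NaN/NaN rows land here too -- mapping them to NONVESSEL is an
        # assumption; the correct label for them is left open.
        return NONVESSEL
    if pd.isna(is_fishing):
        return NONFISHING  # True/NaN -> 2/NONFISHING, as argued above
    return FISHING if is_fishing else NONFISHING

# The five is_vessel/is_fishing combinations that occur in the data:
detections = pd.DataFrame({
    'is_vessel':  [True, True, True, False, np.nan],
    'is_fishing': [True, False, np.nan, np.nan, np.nan],
})
vessel_class = [classify(r.is_vessel, r.is_fishing)
                for r in detections.itertuples()]
print(vessel_class)  # [1, 2, 2, 3, 3]
```

With this version the True/NaN rows that the current loop sends to class 1 come out as class 2 instead.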