Open zcrysler opened 6 years ago
The XXX_for_tag_project
items in the current data API are not returning ambiguous detection, so they're not ending up in tag project .motus
files. (In contrast, ambiguous detections are returned by the XXX_for_recvitems, and so **are** in receiver
.motus` files). This is because ambiguous detections definitely belong to the receiver owner, but we don't know which tag owner they belong to.
This is a big reason why tag project datasets are incomplete. Below is the number of registered tags by project, where there's at least one detection that could be the tag, but is ambiguous. For each of those tags, there will be some possible detections that have not been downloaded into tag project .motus
files, because we don't know for sure which project the detections belong to.
Obviously fixing this is high priority.
+-----------+--------------+
| projectID | numAmbigTags |
+-----------+--------------+
| 1 | 4 |
| 3 | 1 |
| 9 | 3 |
| 10 | 2 |
| 11 | 8 |
| 14 | 34 |
| 15 | 28 |
| 19 | 20 |
| 20 | 20 |
| 28 | 3 |
| 29 | 2 |
| 32 | 6 |
| 37 | 4 |
| 38 | 40 |
| 40 | 8 |
| 42 | 1 |
| 45 | 19 |
| 46 | 1 |
| 47 | 41 |
| 48 | 6 |
| 49 | 2 |
| 50 | 15 |
| 57 | 92 |
| 63 | 2 |
| 64 | 33 |
| 65 | 13 |
| 67 | 12 |
| 74 | 2 |
| 78 | 20 |
| 81 | 19 |
| 82 | 33 |
| 87 | 2 |
| 88 | 1 |
| 90 | 11 |
| 91 | 2 |
| 92 | 37 |
| 95 | 1 |
| 97 | 1 |
| 100 | 5 |
| 103 | 19 |
| 109 | 2 |
| 111 | 1 |
| 124 | 6 |
| 140 | 2 |
| 146 | 33 |
| 153 | 6 |
| 154 | 3 |
+-----------+--------------+
To quickly get data for project P
, we currently use this
scaffolding:
projBatch
with tagDepProjectID == P
batchRuns
with tagDepProjectID == P
, batchID == :batchID
hits
with tagDepProjectID == P
, batchID == :batchID
To extend this to ambiguous tag detections, where we don't know the tagDepProjectID, we can generate new, negative projectIDs. Each represents a unique set of up to say 6 real projects (to match the up to 6 possible tags in an ambiguous tag record). An ambiguous project ID is generated for each set of projectIDs for which there's a tag ambiguity connecting those projects. That way, when an ambiguous tag is detected (negative motusTagID), we assign can assign a corresponding ambiguous projectID.
Moreover, the same optimized queries that fetch batches, hits, and runs by projectID will work unchanged for these new, negative projectIDs.
We need a new table to manage ambiguous project IDs
CREATE TABLE projAmbig (
ambigProjectID SMALL INT UNIQUE PRIMARY KEY NOT NULL, -- unique negative ID
projectID1 SMALL INT, -- motus project ID #1
projectID2 SMALL INT, -- motus project ID #2; projectID2 > projectID1
projectID3 SMALL INT,
projectID4 SMALL INT, -- ... projectID(n+1) > projectID(n)
projectID5 SMALL INT,
projectID6 SMALL INT, -- motus project ID #6 projectID6 > projectID5
UNIQUE PRIMARY KEY (projectID1, projectID2, projectID3, projectID4, projectID5, projectID6)
);
CREATE INDEX projAmbig_ambigProjectID on projAmbig(ambigProjectID);
Also, table tagAmbig
needs a new column ambigProjectID
which will
link to table projAmbig
. That will allow us to look up tagDepProjectID
for ambiguous runs by looking up motusTagID
in table tagAmbig
.
when a hit or run is assigned a negative tagDepProjectID, that means the hit or run could belong to any of projectID1, ..., projectID6 listed in the record where ambigProjectID = tagDepProjectID (< 0).
We also need a new API entry: ambiguous_projects_for_tag_project (P)
that returns the list of negative projectIDs of which P
is a member.
The client-side code then needs to grab all batches, hits, runs for
each of those ambiguous project IDs, merging locally. The only
complication with merging is that a batch record might be fetched for
two different project IDs, in which case the numHits field should be
summed, but otherwise nothing changes.
The rest of the code should work as-is, but of course on the client side, we'll need to do some finagling to mark ambiguous detections, and/or lookup metadata by ambiguous project.
So detections showing up as motusTagID "-42" from receiver Lotek 223 for example, are ambiguous tags? They have no info for tagProjID, id, tagType, mfg, tagModel, tagBI, and fullID is "?proj?-0#:0.0"
FYI. Not sure if this will help or not in the Ambiguous arena, but Margaret Eng's project - # 146 is comprised primarily of ambigous tags - duplicates with RBrown American Woodcock #82 and James Bay - #38. On hourly plots (from Old Cut) most of her detections are labelled as one or the other, but many are not.
Some more details on Margaret's project 146:
I know you dealt with this in the plots you create, but in the detection data, is there a way to mark tags as ambiguous so that users know to look into detections for certain tags. In this case tags for project 64 are overlapping with 57. SELVA tags so it's pretty obvious for James, but it may not be for others down the road.