Erotemic / ibeis

image based ecological information system
Apache License 2.0

Using without GUI #79

Closed: sadda closed this issue 2 years ago

sadda commented 2 years ago

Hi,

thanks a lot for developing this software. I would like to ask whether there is a simple way to use HotSpotter/Ibeis without the GUI. In my case, all files are in one folder (or the file paths are stored in a list), and one would call something similar to

import numpy as np

n = len(imgs)

descriptors = []
for img in imgs:
    descriptors.append(extract_keypoints_and_descriptors(img))

scores = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        scores[i, j] = similarity_score(descriptors[i], descriptors[j])

Thanks a lot,

Lukas

Erotemic commented 2 years ago

Yes, everything (and more) can be done outside of the GUI.

import ibeis
import utool as ut
gpath1 = ut.grab_test_imgpath('astro.png')
gpath2 = ut.grab_test_imgpath('carl.jpg')
ibs = ibeis.opendb('my_new_db', allow_newdir=True) 

image_id_list = ibs.add_images([gpath1, gpath2])

# Make one box for each image
# Annotations are defined by boxes (with optional rotation) in images.
image_dsize_list = ibs.get_image_sizes(image_id_list)
box_list = [[0, 0, w, h] for (w, h) in image_dsize_list]

# Create the new annotations.
# The matching algorithms work with respect to these annotation ids.
annot_id_list = ibs.add_annots(image_id_list, box_list)

# If you want raw keypoint & descriptor access.
# (Note: the config2_ argument can change the configuration, although it is
# not a very intuitive interface; look in the code for examples.)
kpts_list = ibs.get_annot_kpts(annot_id_list)
desc_list = ibs.get_annot_vecs(annot_id_list)
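# A quick sanity check on the raw features. (A sketch: the exact keypoint
# column layout is an ibeis/vtool internal, so treat the printed shapes as
# illustrative.)
for aid, kpts, vecs in zip(annot_id_list, kpts_list, desc_list):
    print(f'annot {aid}: {len(kpts)} keypoints, descriptor shape {vecs.shape}')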

# You can call hotspotter (or another matching algorithm) directly 
# Define query ids and the database ids they search against
query_annot_ids = annot_id_list
database_annot_ids = annot_id_list

query_request = ibs.new_query_request(query_annot_ids, database_annot_ids)
chipmatch_list = query_request.execute()

for chipmatch in chipmatch_list:
    print(f'chipmatch={chipmatch}')
    query_aid = chipmatch.qaid

    # Mapping from each database annotation to the feature match indexes
    # (wrt the kpts/vecs arrays)
    feature_matches = chipmatch.aid2_fm

    # Mapping from each database annotation to the score of each feature
    # match against this query
    feature_scores = chipmatch.aid2_fs

    # Mapping from database annots to the final scalar score against this query
    annot_scores = chipmatch.aid2_score
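
If the end goal is the pairwise score matrix from your original snippet, the pieces above are enough. A rough sketch (aid_to_index is just a helper introduced here; it assumes every annotation was queried, as set up above):

import numpy as np

# Fold the chipmatch results into an n x n matrix of scores, indexed the
# same way as annot_id_list.
aid_to_index = {aid: idx for idx, aid in enumerate(annot_id_list)}
n = len(annot_id_list)
scores = np.zeros((n, n))
for chipmatch in chipmatch_list:
    qidx = aid_to_index[chipmatch.qaid]
    for daid, score in chipmatch.aid2_score.items():
        scores[qidx, aid_to_index[daid]] = score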

Also without the GUI:

# Given a chip match
fig = chipmatch.show_analysis(qreq_=query_request)
fig.savefig('match_viz.jpg')

# But you do need to use some sort of image viewer to see it
# This uses whatever your system opens jpgs with. 
ut.startfile('match_viz.jpg')

The above code returns this:

[match_viz.jpg: the rendered match analysis figure]

sadda commented 2 years ago

Thanks a lot. It took me some time to go through it and process it. If I use Ibeis on a clean database (empty folder), it works perfectly. But something seems off when I rerun the same code with different database_annot_ids. For example, I first ran the code with database_annot_ids between 1 and 60. Then I changed it and obtained these results (sorry for the formatting):

>>> database_annot_ids
[1, 2, 21, 22, 41, 42]
>>> chipmatch.aid2_score
{34: 0.31773189042375305, 18: 3.1777660224881323, 8: 0.3280498285998196,
 27: 0.17678951225783912, ..., 42: 0.10928233033349666,
 56: 0.03800948456348979, 39: 0.3439561757804006, ...}

The returned ids do not correspond to database_annot_ids, and there are more than six matches, even though there are only six database annotations. When I run the same code on a clean database, it works.

Another question: when chipmatch.aid2_score is an empty dictionary, does it mean that it did not find a match?

Erotemic commented 2 years ago

If you create a new query request object, I don't think you should see the case where they don't match. You can't just set the attribute on an existing query request object; you have to construct a new one for each new query.

Try:

database_annot_ids = [1, 2, 21, 22, 41, 42]
query_annotation_id = 5  # whatever annotation you are interested in querying
qreq_ = ibs.new_query_request(qaid_list=[query_annotation_id], daid_list=database_annot_ids)
chipmatch = qreq_.execute()[0]

The attributes chipmatch.daid_list and chipmatch.aid2_score.keys() should match what you specified in database_annot_ids (they might be a subset for cases where there were no matches, but they should never contain ids that were not in the original qreq_.daids).

You can use list(ibs.annots()) (new way) or ibs.get_valid_aids() (old way) to get a list of all of the annotation ids in your database. Make sure you only use annotations that exist in your queries; see the sketch below.
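Putting those two points together, a defensive version of the query might look like this sketch (it only uses the attributes described above):

# Only query against annotations that actually exist in the database.
valid_aids = set(ibs.get_valid_aids())
database_annot_ids = [aid for aid in database_annot_ids if aid in valid_aids]

qreq_ = ibs.new_query_request(qaid_list=[query_annotation_id], daid_list=database_annot_ids)
chipmatch = qreq_.execute()[0]

# Matched ids should always be a subset of the requested database ids.
assert set(chipmatch.aid2_score.keys()) <= set(database_annot_ids)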

If you can demonstrate a case where you do really get database annotations in your chipmatch that were not in the original daid_list, then that is a serious bug that needs to be fixed. I just don't think the likelihood of that existing is very high. This program has been put through its paces. I imagine there are bugs in it, but I'm guessing they wouldn't be something that egregious.


SIDE NOTE:

Another thing that is VERY important to note: the scores you get back in aid2_score are NOT absolute. They are always relative to (1) the query and (2) all of the other database annotations (and also the configuration dictionary, but we aren't varying that here). I see a lot of first-time users with the misconception that if Annotation 1 matches Annotation 42 with a score of 1000, that is more likely to be a match than if Annotation 2 matches Annotation 88 with a score of 100. It very much depends on the images. The only thing the scores tell you is whether the top-ranked result is more likely to be a match than the second-ranked result. It's better to think of them as defining an ordering rather than an absolute score.
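In other words, treat aid2_score as defining a ranking for a single query; a sketch:

# Rank the database annotations for one query by descending (relative) score.
ranked = sorted(chipmatch.aid2_score.items(), key=lambda item: item[1], reverse=True)
for rank, (daid, score) in enumerate(ranked, start=1):
    print(f'rank {rank}: annot {daid} (relative score {score:.3f})')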

Technically, there is some information about which match is correct in the magnitude of the score, but it's a lot less than you would think.

One of the important experiments from my thesis was demonstrating that the scores for "correct" and "incorrect" matches are not very easy to separate:

[Figure: overlapping histograms of match scores; red = incorrect (negative) matches, blue = correct (positive) matches]

In the figure, the red histogram shows scores of incorrect ("negative") matches, and the blue histogram shows scores of correct ("positive") matches. There is a threshold above which some of the correct matches can be automatically separated from the negative matches, but it produces a ton of false negatives (incorrectly labeling positive matches as negatives). And there is no threshold that can determine that a match is absolutely incorrect.

And again, this was wrt a relatively constant database size (modulo the query annotation), so these scores will shift magnitudes as the size of (number of annotations in) the database changes.
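The practical upshot is that a threshold can only be used one way: a high score can auto-accept some matches, but a low score cannot safely auto-reject. A sketch (THRESHOLD is hypothetical and database-dependent):

# One-way decision rule: auto-accept above a hypothetical,
# database-dependent threshold; never auto-reject below it.
THRESHOLD = 500.0  # illustrative only; would need calibration on your data
for daid, score in chipmatch.aid2_score.items():
    if score >= THRESHOLD:
        print(f'annot {daid}: auto-accepted match')
    else:
        print(f'annot {daid}: send to manual review (cannot auto-reject)')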

sadda commented 2 years ago

It works now. Thanks a lot :)