olgaliak / active-learning-detect

Active learning + object detection
MIT License

support for class-balancing for download-vott-json flow #41

olgaliak closed 5 years ago

yashpande commented 5 years ago

Hey Olga! Hope you're doing well!!

I'm sure everything works but there were a few small things I noticed:

1) The comments on config.ini after line 9 were a little confusing. I didn't get how the "10% not detected" part works. Maybe add more on that to the config_description.md file instead of as a comment?

2) In line 155 of download_vott_json you used isclose(s, 1, abs_tol=0.01) which makes sense but I think a more readable option would be abs(s-1)<0.01 since (hopefully) there won't be NaN/Inf values.

3) In prepare_per_class_dict (around line 132) for download_vott_json you could probably use a defaultdict(list) for cleaner code (you could say result=defaultdict(list) and then delete lines 133 and 134). Also, you should probably make classes a set instead of an np array to speed up the membership check on line 132.

4) In line 177 onwards for get_top_rows you could move the for loop in line 181 inside the if statement in line 177 to make it easier to understand. Something like:

if ideal_class_balance is not None: 
    class_balances_cnt = len(ideal_class_balance)
    for folder_name in all_files:
        all_files_per_class = prepare_per_class_dict(all_files[folder_name], class_balances_cnt, tag_names)
        # Rest of line 184 onwards
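For point 3, here is a minimal sketch of the defaultdict(list) idea. The real prepare_per_class_dict is not shown in this thread, so the function name group_rows_by_class, its parameters, and the row format (pairs of file name and class name) are all hypothetical. It only illustrates the grouping pattern and the set membership test.

```python
from collections import defaultdict


def group_rows_by_class(rows, wanted_classes):
    """Group file names by class, keeping only classes in wanted_classes.

    Hypothetical stand-in for prepare_per_class_dict: `rows` is assumed to be
    an iterable of (file_name, class_name) pairs.
    """
    # A set gives constant-time average membership tests, unlike scanning an array.
    wanted = set(wanted_classes)
    # defaultdict(list) creates the empty list on first access, so no
    # "if key not in result: result[key] = []" guard is needed.
    result = defaultdict(list)
    for file_name, class_name in rows:
        if class_name in wanted:
            result[class_name].append(file_name)
    return result


rows = [("a.jpg", "cat"), ("b.jpg", "dog"), ("c.jpg", "cat"), ("d.jpg", "bird")]
grouped = group_rows_by_class(rows, ["cat", "dog"])
print(dict(grouped))
```

One thing to keep in mind with this pattern: reading a missing key with grouped[key] inserts an empty list, so use `key in grouped` or grouped.get(key) when you only want to check for a class.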

Feel free to ignore everything I said - that's why I put it in a comment here and not on the actual code 😄
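For point 2, a small sketch comparing the two tolerance checks. It assumes s is the sum of the ideal_class_balance values, which this thread does not confirm. The helper names are hypothetical.

```python
from math import isclose


def sums_to_one_isclose(balance, tol=0.01):
    # math.isclose with abs_tol: true when |sum - 1| <= tol
    # (the default rel_tol of 1e-09 is negligible next to tol here).
    return isclose(sum(balance), 1, abs_tol=tol)


def sums_to_one_abs(balance, tol=0.01):
    # Plain comparison: true when |sum - 1| < tol (strict inequality).
    return abs(sum(balance) - 1) < tol


good = [0.7, 0.2, 0.1]   # sums to 1 up to floating-point rounding
bad = [0.5, 0.3]         # sums to 0.8
not_a_number = [float("nan")]

for balance in (good, bad, not_a_number):
    print(balance, sums_to_one_isclose(balance), sums_to_one_abs(balance))
```

The two checks differ only exactly at the boundary (<= versus <). For a NaN sum both return False, so the choice here is mostly about readability.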

olgaliak commented 5 years ago

@yashpande, thanks a lot for looking into the PR, great comments, I will address them! :)

abfleishman commented 5 years ago

@olgaliak I am not sure whether this is a result of this PR, but I pulled the latest and am working on master. I updated my config and tried to download images for tagging. I asked for 5 images and it started downloading thousands, and each photo is listed multiple times. Seems fishy... A screenshot of the output and my config (config_bbalous.zip) are attached.

Let me know if I am misspecifying something or if you need more info to look into this.