hsosik / ifcb-analysis


unclassified counts: troubleshooting unexpected counts in unclassified from countcells_MVCO_manual #24

Open eepeacock opened 7 years ago

eepeacock commented 7 years ago

Heidi was using the summary file created by countcells_MVCO_manual, and found many files with counts in "unclassified".

This indicates some errors in our workflow, because in every file where unclassified is counted, it should be empty.

We both feel like we addressed this before, but I don't see any issues about it.

I found 2 sources of error:

  1. In "get_annotated_classesMVCO", the case for 'diatoms' needs to exclude 'unclassified' from counting, but it did not. (Again, I feel like we fixed this before.) I just checked the change log on GitHub, and it shows a commit by hsosik in November 2016: "add new ciliate classes and update diatom case to include exclusion of unclassified". WHY do we not have that change in the repository now?????

  2. There were lots of files with 1-2 ROIs in unclassified. They were all "all_categories" files, so it is correct to count unclassified for them. I batch classified these files for unclassified, and most of the ROIs looked like things that could almost have been ciliates but had been moved out of the ciliate categories. I moved most of them to 'other', 'dino', or 'flagellate'. We don't know exactly why or how they got there, but I did check with Emily Brownlee to make sure that when she removes things from ciliate categories, she puts them in 'other' or another real category. This didn't seem surprising to her, but maybe one of us did it a while ago.
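
To make the two checks above concrete, here is a rough sketch in Python (not the MATLAB in the repository). Everything in it is an assumption for illustration: the class names, the case names other than 'diatoms' and "all_categories", the shape of the summary, and both function names. It only shows the idea that 'unclassified' is counted for "all_categories" files, excluded for 'diatoms' files, and flagged wherever it is counted but not empty.

```python
# Hypothetical class lists; the real lists live in the MATLAB code and config.
ALL_CLASSES = ["Chaetoceros", "Guinardia", "ciliate_mix", "other", "unclassified"]
DIATOM_CLASSES = ["Chaetoceros", "Guinardia"]


def classes_to_count(case):
    """Return the classes whose counts should be reported for an annotation case."""
    if case == "all_categories":
        # Every ROI was reviewed, so every class (including 'unclassified') counts.
        return list(ALL_CLASSES)
    if case == "diatoms":
        # Only diatoms were annotated, so 'unclassified' must never be counted here.
        return [c for c in DIATOM_CLASSES if c != "unclassified"]
    raise ValueError(f"unknown annotation case: {case!r}")


def files_with_unclassified(summary):
    """List (filename, count) pairs where 'unclassified' is counted but not empty.

    `summary` is an assumed structure: filename -> (case, {class_name: count}).
    """
    flagged = []
    for filename, (case, counts) in sorted(summary.items()):
        if "unclassified" not in classes_to_count(case):
            continue  # not a counted class for this case, so ignore it
        n = counts.get("unclassified", 0)
        if n > 0:
            flagged.append((filename, n))
    return flagged


# Made-up example data, only to exercise the functions.
example_summary = {
    "fileA": ("all_categories", {"other": 5, "unclassified": 2}),
    "fileB": ("diatoms", {"Chaetoceros": 3, "unclassified": 40}),
    "fileC": ("all_categories", {"other": 1, "unclassified": 0}),
}
flagged_files = files_with_unclassified(example_summary)
```

With this made-up data, only fileA is flagged: fileB's 'unclassified' ROIs are skipped because the 'diatoms' case excludes that class, and fileC's count is zero.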

eepeacock commented 7 years ago

Heidi looked at the history of get_annotated_classesMVCO with me, and I had misunderstood the syntax of the changes. Somehow we had added the needed change and then accidentally undone it with the next revision on the same day. We are going to recommit with 'unclassified' excluded.