Some features are always calculated to be NaN

MichaelCurrie commented 9 years ago

While investigating the histogram output for our sample of 10 Schafer feature files, I found that some features are apparently never calculated.

This comes from the following lines of generate_stats.py:

    base_path = os.path.abspath(mv.user_config.EXAMPLE_DATA_PATH)
    root_path = os.path.join(base_path, '30m_wait')

    exp_histogram_manager, ctl_histogram_manager = \
        obtain_histograms(root_path, "pickled_histograms.dat")

    ctl_histogram_manager.plot_information()

(Pdb) feature_spec.ix[blank_feature_list][['feature_field', 'data_type', 'motion_type']]
                                       feature_field data_type motion_type
560         locomotion.crawling_bends.head.amplitude       All     Forward
561         locomotion.crawling_bends.head.amplitude  Absolute     Forward
562         locomotion.crawling_bends.head.amplitude  Positive     Forward
563         locomotion.crawling_bends.head.amplitude  Negative     Forward
576      locomotion.crawling_bends.midbody.amplitude       All     Forward
577      locomotion.crawling_bends.midbody.amplitude  Absolute     Forward
578      locomotion.crawling_bends.midbody.amplitude  Positive     Forward
579      locomotion.crawling_bends.midbody.amplitude  Negative     Forward
592         locomotion.crawling_bends.tail.amplitude       All     Forward
593         locomotion.crawling_bends.tail.amplitude  Absolute     Forward
594         locomotion.crawling_bends.tail.amplitude  Positive     Forward
595         locomotion.crawling_bends.tail.amplitude  Negative     Forward
608         locomotion.crawling_bends.head.frequency       All     Forward
609         locomotion.crawling_bends.head.frequency  Absolute     Forward
610         locomotion.crawling_bends.head.frequency  Positive     Forward
611         locomotion.crawling_bends.head.frequency  Negative     Forward
624      locomotion.crawling_bends.midbody.frequency       All     Forward
625      locomotion.crawling_bends.midbody.frequency  Absolute     Forward
626      locomotion.crawling_bends.midbody.frequency  Positive     Forward
627      locomotion.crawling_bends.midbody.frequency  Negative     Forward
640         locomotion.crawling_bends.tail.frequency       All     Forward
641         locomotion.crawling_bends.tail.frequency  Absolute     Forward
642         locomotion.crawling_bends.tail.frequency  Positive     Forward
643         locomotion.crawling_bends.tail.frequency  Negative     Forward
675                          posture.coils.frequency       All         All
676                         posture.coils.time_ratio       All         All
679          locomotion.turns.omegas.event_durations  Positive         All
683      locomotion.turns.omegas.time_between_events  Positive         All
687  locomotion.turns.omegas.distance_between_events  Positive         All
689                locomotion.turns.omegas.frequency       All         All
690               locomotion.turns.omegas.time_ratio       All         All
703              locomotion.turns.upsilons.frequency       All         All
704             locomotion.turns.upsilons.time_ratio       All         All
709       locomotion.motion_events.forward.frequency       All         All
710      locomotion.motion_events.forward.time_ratio       All         All
711      locomotion.motion_events.forward.data_ratio       All         All
716        locomotion.motion_events.paused.frequency       All         All
717       locomotion.motion_events.paused.time_ratio       All         All
718       locomotion.motion_events.paused.data_ratio       All         All
723      locomotion.motion_events.backward.frequency       All         All
724     locomotion.motion_events.backward.time_ratio       All         All
725     locomotion.motion_events.backward.data_ratio       All         All
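For anyone who wants to reproduce a list like the one above without going through the histogram plots, here is a minimal sketch of the check. It is not the toolbox's code: `find_blank_features`, the `values_by_feature` dict layout, and the second feature name are all assumptions made up for illustration.

```python
import numpy as np

def find_blank_features(values_by_feature):
    """Return the names of features that are empty or NaN in every video.

    values_by_feature is a hypothetical dict mapping a feature name to a
    list of 1-D arrays, one array per video.
    """
    blank = []
    for name, per_video in values_by_feature.items():
        arrays = [np.asarray(v, dtype=float) for v in per_video]
        # A video contributes nothing if its array is empty or all NaN.
        if all(a.size == 0 or np.all(np.isnan(a)) for a in arrays):
            blank.append(name)
    return blank

# Toy input: the first feature is NaN in both videos, the second
# (a made-up name) has one real value in the first video.
example = {
    "posture.coils.frequency": [[np.nan], [np.nan, np.nan]],
    "hypothetical.feature.with_data": [[1.0, np.nan], [np.nan]],
}
blank_features = find_blank_features(example)
```

A feature that is NaN in only some videos is deliberately not reported here, since the list above is about features that never produce a value.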

MichaelCurrie commented 9 years ago

Most are okay, though:

(Plots from HistogramManager.plot_information())

Cumulative invalid histograms in a 10-video sample (including features where some videos had valid histograms and some did not):

(That is, if all 726 features were invalid in every one of the 10 videos, we'd have 7260 invalid histograms.)
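The counting behind that figure can be sketched as follows. The `valid` mask and the toy failures in it are assumptions for illustration only, not the real results from the 10-video sample.

```python
import numpy as np

n_videos, n_features = 10, 726

# Hypothetical validity mask: valid[v, f] is True when video v produced a
# usable histogram for feature f. Start with everything valid.
valid = np.ones((n_videos, n_features), dtype=bool)

# Toy data, not the real results: features 0-2 fail in every video,
# feature 3 fails in only 4 of the 10 videos.
valid[:, 0:3] = False
valid[:4, 3] = False

# Number of videos in which each feature's histogram is invalid.
invalid_per_feature = (~valid).sum(axis=0)
total_invalid = int(invalid_per_feature.sum())
max_invalid = n_videos * n_features

# Features that never produce a histogram vs. ones that only sometimes fail.
always_invalid = np.flatnonzero(invalid_per_feature == n_videos)
sometimes_invalid = np.flatnonzero(
    (invalid_per_feature > 0) & (invalid_per_feature < n_videos)
)
```

Separating `always_invalid` from `sometimes_invalid` matters because the cumulative count mixes the two, and only the first group corresponds to features that are never calculated.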

JimHokanson commented 9 years ago

Does this still happen? Are you using a cached version of the histograms? I fixed a bug about a week ago in which one feature was no longer being computed. This is related to the issue I set up about the comparison being too lenient when merging NaNs: #152

MichaelCurrie commented 9 years ago

It's definitely not because of the cached (pickled) histograms in generate_stats.py; I deleted my pickle file within the past week.

I agree this is related. As you say in that issue, it means we may be fooling ourselves when we think we agree with the Schafer code in cases where we generate all NaNs for a feature.
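To illustrate the concern, here is a minimal sketch of how a NaN-tolerant comparison can report agreement on a feature that was never actually computed. This is not the toolbox's comparison code; `lenient_match` and `strict_match` are hypothetical names, and the stricter rule is just one possible choice.

```python
import numpy as np

def lenient_match(ours, theirs):
    # NaN is treated as equal to NaN, so two all-NaN arrays "agree".
    return np.allclose(ours, theirs, equal_nan=True)

def strict_match(ours, theirs):
    # Same tolerance, but refuse to call it a match unless at least one
    # finite value was actually compared.
    ours = np.asarray(ours, dtype=float)
    theirs = np.asarray(theirs, dtype=float)
    if ours.shape != theirs.shape:
        return False
    if not np.any(np.isfinite(ours) & np.isfinite(theirs)):
        return False
    return np.allclose(ours, theirs, equal_nan=True)

all_nan = np.full(5, np.nan)
lenient_result = lenient_match(all_nan, all_nan)  # True
strict_result = strict_match(all_nan, all_nan)    # False
```

With the lenient rule, an all-NaN feature passes validation even though nothing was verified; the stricter rule flags it so it can be looked at by hand.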

openworm / open-worm-analysis-toolbox

Some features are always calculated to be NaN #153