probcomp / crosscat

A domain-general, Bayesian method for analyzing high-dimensional data tables
http://probcomp.csail.mit.edu/crosscat/
Apache License 2.0

crosscat does stupidly many deep copies #57

Open riastradh-probcomp opened 9 years ago

axch commented 9 years ago
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001   61.224   61.224 check_stability.py:17(<module>)
        1    0.000    0.000   60.961   60.961 check_stability.py:60(analyze_fileset)
        4    0.000    0.000   60.923   15.231 check_stability.py:119(analyze_queries)
       68    0.001    0.000   60.547    0.890 bayesdb.py:131(execute)
       16    0.000    0.000   60.480    3.780 check_stability.py:32(country_purpose_queries)
       68    0.011    0.000   60.442    0.889 bql.py:38(execute_phrase)
       16    0.001    0.000   60.258    3.766 bqlfn.py:327(bayesdb_simulate)
       16    0.021    0.001   60.256    3.766 crosscat.py:1176(simulate)
       16    0.000    0.000   59.559    3.722 LocalEngine.py:293(simple_predictive_sample)
       16    0.000    0.000   59.559    3.722 LocalEngine.py:822(_do_simple_predictive_sample)
       16    0.000    0.000   59.558    3.722 sample_utils.py:251(simple_predictive_sample_multistate)
       24    0.000    0.000   59.557    2.482 sample_utils.py:219(simple_predictive_sample)
       24    1.040    0.043   59.557    2.482 sample_utils.py:531(simple_predictive_sample_unobserved)
    25000    0.106    0.000   55.931    0.002 sample_utils.py:518(create_cluster_model_from_X_L)
    30894    0.470    0.000   54.757    0.002 sample_utils.py:370(create_cluster_model)
   211693   10.927    0.000   54.291    0.000 sample_utils.py:358(create_component_model)
17285763/211693   21.519    0.000   43.120    0.000 copy.py:145(deepcopy)

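The workload calls "simulate limit 1000" a few times (16 calls to bayesdb_simulate in this profile). Of the roughly 61 seconds total, about 54 seconds are spent under create_component_model, and about 43 of those are in copy.deepcopy.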
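For anyone who wants to produce a table like the one above, here is a minimal sketch using the standard library's cProfile and pstats, sorted by cumulative time. The functions build_component and workload are hypothetical stand-ins, not crosscat code. In the ncalls column, a pair such as 17285763/211693 means total calls / primitive (non-recursive) calls, which is what a recursive function like deepcopy produces.

```python
import cProfile
import copy
import io
import pstats


def build_component(params):
    # Hypothetical stand-in for a hot function that deep-copies its input.
    return copy.deepcopy(params)


def workload(n=1000):
    # Hypothetical workload: build many components from the same parameters.
    params = {"hypers": {"mu": 0.0, "r": 1.0}, "counts": [0] * 10}
    return [build_component(params) for _ in range(n)]


def profile_report(func, limit=10):
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(limit)
    return out.getvalue()


report = profile_report(workload)
print(report)
```

Sorting by cumulative time puts the callers at the top and makes it easy to follow the chain down to the function that actually burns the time, as in the listing above.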

axch commented 9 years ago

Fixed by 68eb07a0639e48fe5edc4482aabba1aebf686fb0

riastradh-probcomp commented 9 years ago

This is not the only case. See also ensure_multistate in sample_utils.py: https://github.com/probcomp/crosscat/blob/68eb07a0639e48fe5edc4482aabba1aebf686fb0/crosscat/utils/sample_utils.py#L850-L855
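As a rough illustration of the general pattern (this is not the code at the linked lines, and the names below are hypothetical), compare a helper that defensively deep-copies its argument on every call with one that only wraps it. The second is safe only under the assumption that callers treat the result as read-only.

```python
import copy
import timeit

# Hypothetical stand-in for a latent state; not crosscat's real structures.
state = {
    "column_hypers": [
        {"mu": 0.0, "r": 1.0, "nu": 1.0, "s": 1.0} for _ in range(50)
    ],
    "view_assignments": list(range(50)),
}


def as_list_copying(x):
    """Defensive version: deep-copies the whole structure on every call."""
    x = copy.deepcopy(x)
    return x if isinstance(x, list) else [x]


def as_list_sharing(x):
    """Read-only version: wraps without copying; callers must not mutate."""
    return x if isinstance(x, list) else [x]


copied = as_list_copying(state)
shared = as_list_sharing(state)

# Time both helpers over the same number of calls.
t_copy = timeit.timeit(lambda: as_list_copying(state), number=200)
t_share = timeit.timeit(lambda: as_list_sharing(state), number=200)
print(f"deepcopy: {t_copy:.4f}s  shared: {t_share:.4f}s")
```

Whether the copy can be dropped in any given place depends on whether something downstream mutates the structure, so each call site needs to be checked individually.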