minor speed bump attempt #2

Made two changes to skip setting the dtype to float during the launching of methods when the dtype is already np.float32.
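For illustration, the idea is roughly the following. This is a minimal sketch and not the code in this PR: the helper name ensure_float32 and the toy counts frame are hypothetical.

```python
import numpy as np
import pandas as pd


def ensure_float32(counts: pd.DataFrame) -> pd.DataFrame:
    """Return counts as float32, casting only when it is actually needed."""
    # Hypothetical helper: if every column is already float32, return the
    # input untouched instead of paying for another astype() pass.
    if all(dtype == np.float32 for dtype in counts.dtypes):
        return counts
    return counts.astype(np.float32)


# Toy data: 3 genes x 2 cells, already float32.
counts = pd.DataFrame(np.ones((3, 2), dtype=np.float32), columns=["c1", "c2"])
same = ensure_float32(counts)                          # no cast, same object
converted = ensure_float32(counts.astype(np.float64))  # cast happens here
```

The saving comes from not re-casting (and copying) a large counts matrix that is already in the target dtype.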

Refactored how build_cluster retrieves and calculates the cluster means so that it uses numpy's mean method, which is a lot faster than applying it through pandas. Running on 200k cells now actually finishes in about 5 hours.
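A rough sketch of the numpy approach, under assumptions: counts is genes x cells, meta is indexed by cell name, and the cluster label lives in a column called cell_type. The function name cluster_means_numpy and these names are illustrative and are not taken from build_cluster itself.

```python
import numpy as np
import pandas as pd


def cluster_means_numpy(counts: pd.DataFrame, meta: pd.DataFrame) -> pd.DataFrame:
    """Mean expression per gene (rows) for each cluster (columns)."""
    values = counts.to_numpy()  # genes x cells, plain ndarray
    means = {}
    # groups maps each cluster label to the cell names that belong to it.
    for cluster, cells in meta.groupby("cell_type").groups.items():
        cols = counts.columns.get_indexer(cells)     # cell names -> positions
        means[cluster] = values[:, cols].mean(axis=1)  # numpy mean over cells
    return pd.DataFrame(means, index=counts.index)


# Toy data: 2 genes, 3 cells, 2 clusters.
counts = pd.DataFrame(
    [[1, 3, 5], [2, 4, 6]],
    index=["g1", "g2"],
    columns=["a", "b", "c"],
    dtype=np.float32,
)
meta = pd.DataFrame({"cell_type": ["T", "T", "B"]}, index=["a", "b", "c"])
means = cluster_means_numpy(counts, meta)
```

The means are taken on the underlying ndarray, so pandas is only used to look up which cells belong to each cluster and to wrap the final result.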

There's still a bit that I think can be sped up, but I don't understand the code well enough to work out how the for-loop retrieves and updates the interaction database and the base_result object:

Particularly in mean_analysis and percent_analysis. Both functions run during the Running Real Analysis step; mean_analysis also runs with every iteration of shuffled_analysis in the Running Statistical Analysis step.

Both functions contain this for-loop:

for interaction_index, interaction in interactions.iterrows():
    for cluster_interaction in cluster_interactions:
        ...
    # ending in something like this
    result.at[index, column] = value
return result

The same goes for build_percent_result, which has the same starting statement.
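One generic way loops of this shape are often sped up is to write into a preallocated numpy array and build the DataFrame once at the end, instead of calling result.at on every iteration. The sketch below is not based on the actual bodies of mean_analysis or percent_analysis: the score callable, the weight column and the cluster pair labels are all hypothetical stand-ins for whatever the real loop computes.

```python
import numpy as np
import pandas as pd


def fill_with_at(interactions, cluster_pairs, score):
    """Baseline: one DataFrame.at write per (interaction, cluster pair)."""
    result = pd.DataFrame(0.0, index=interactions.index, columns=cluster_pairs)
    for idx, interaction in interactions.iterrows():
        for pair in cluster_pairs:
            result.at[idx, pair] = score(interaction, pair)
    return result


def fill_with_array(interactions, cluster_pairs, score):
    """Same values, written to an ndarray and wrapped in a DataFrame once."""
    out = np.zeros((len(interactions), len(cluster_pairs)))
    # itertuples avoids building a Series for every row, unlike iterrows.
    for i, interaction in enumerate(interactions.itertuples(index=False)):
        for j, pair in enumerate(cluster_pairs):
            out[i, j] = score(interaction, pair)
    return pd.DataFrame(out, index=interactions.index, columns=cluster_pairs)


# Hypothetical toy inputs; the real code scores cluster means or percents.
interactions = pd.DataFrame({"weight": [1.0, 2.0]}, index=["i1", "i2"])
cluster_pairs = ["T|B", "B|T", "T|T"]


def score(interaction, pair):
    # Placeholder scoring rule, only here to give the loops something to do.
    return interaction.weight * pair.count("T")


slow = fill_with_at(interactions, cluster_pairs, score)
fast = fill_with_array(interactions, cluster_pairs, score)
```

Whether this applies here depends on what the inner loop actually reads from the interaction database and base_result, which I haven't worked out.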

Also, the cython requirement is still there, and it currently doesn't work on python 3.8 unless the version is increased to >=0.29.21.

Teichlab / cellphonedb

minor speed bump attempt #2 #275