Predicting on large DF runs into infinite loop

janosh commented 4 years ago

I've been trying to work around what might be a bug in (auto-)matminer. Making predictions for a large dataframe (around 80,000 rows) never finishes. I think the culprit might be guessing oxidation states, as that seems to take a long time. The run time also increases rapidly from one prediction to the next when I slice the dataframe into chunks and predict on each chunk individually.

@ardunn I couldn't create a minimal example with dummy data that reproduces this issue but maybe you can try to run this script and see if you experience the same issue.
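The chunk-by-chunk slowdown described above can be measured with a small timing loop. This is only a sketch, not the script referred to in this comment: `time_chunks`, `predict_fn` and `chunk_size` are hypothetical names, and it assumes the data supports `len()` and positional slicing (a list, or a pandas DataFrame where `df[a:b]` slices rows).

```python
import time


def time_chunks(data, predict_fn, chunk_size):
    """Run predict_fn on consecutive slices of data and time each call.

    Returns a list of (start_row, n_rows, seconds) tuples so that a
    growing per-chunk run time is easy to spot.
    """
    timings = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        t0 = time.perf_counter()
        predict_fn(chunk)  # hypothetical callable, e.g. a pipeline's predict
        elapsed = time.perf_counter() - t0
        timings.append((start, len(chunk), elapsed))
        # Flush so progress stays visible even if a later chunk stalls.
        print(f"rows {start}-{start + len(chunk) - 1}: {elapsed:.2f} s", flush=True)
    return timings
```

With a fitted pipeline this might be called as `time_chunks(df, pipe.predict, 5000)`. The chunk size is arbitrary, and since every chunk is the same size, a steadily growing time per chunk would point to something other than the amount of data.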

janosh commented 4 years ago

Turns out that if I only use an ElementProperty featurizer (which generates the only features that are retained anyway), the problem disappears.

import automatminer as amm
from matminer.featurizers.composition import ElementProperty

featurizers = {
    "composition": [ElementProperty.from_preset("magpie")],
    "structure": [],
}
pipe_config = {
    **amm.get_preset_config(),
    "autofeaturizer": amm.AutoFeaturizer(
        featurizers=featurizers,
        guess_oxistates=False,
    ),
}

pipe = amm.MatPipe(**pipe_config)

ardunn commented 4 years ago

Hey @janosh thanks for the bug report. I've been aware of this problem for some time and am actually currently running some tests to try and pinpoint it.

I actually think this is a bug with matminer and job parallelization with multiprocessing. For example, if you try just using StructureToOxidStructure etc. from matminer, I'd wager you'd see the same issues.

What I think is happening behind the scenes is when n_jobs is high (relative to the compute ability of whatever machine you are running it on), the expensive chunks are delegated very few compute cycles by the CPU and/or are not allocated sufficient memory. I don't think there is any infinite loop happening (AFAIK) but the CPU is not allowing a highly parallelized process to run efficiently.

Some tests to try

Does running the bare featurizers (without automatminer) still have this problem? My guess is yes.

If so, does setting n_jobs for an individual featurizer change the halting behavior whatsoever? My guess is that if you set n_jobs=1 the job will go very slowly but eventually finish, and if you turn n_jobs very high you increase the probability it halts indefinitely.
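The two tests above could be run with a helper along these lines. It is a sketch under assumptions: `compare_n_jobs` is a hypothetical name, and the featurizer is only assumed to expose `set_n_jobs` and `featurize_dataframe`, as matminer featurizers do. A run that truly halts will never return, so it is safest to start with a small slice of the data.

```python
import os
import time


def compare_n_jobs(featurizer, df, col_id, n_jobs_values=None):
    """Time featurizer.featurize_dataframe(df, col_id) for several n_jobs values.

    Returns a dict mapping n_jobs -> elapsed seconds.
    """
    if n_jobs_values is None:
        # Assumption: a single worker and one worker per logical CPU are the
        # two extremes worth comparing; adjust for the machine at hand.
        n_jobs_values = sorted({1, os.cpu_count() or 1})
    results = {}
    for n_jobs in n_jobs_values:
        featurizer.set_n_jobs(n_jobs)
        t0 = time.perf_counter()
        # Work on a copy so each run starts from the same, unfeaturized input.
        featurizer.featurize_dataframe(df.copy(), col_id)
        results[n_jobs] = time.perf_counter() - t0
        print(f"n_jobs={n_jobs}: {results[n_jobs]:.2f} s", flush=True)
    return results
```

For the first test, `featurizer` could be a bare `CompositionToOxidComposition()` from `matminer.featurizers.conversions`, applied to a column of pymatgen `Composition` objects with no automatminer involved.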

ardunn commented 4 years ago

https://matsci.org/t/autofeaturizer-stuck-at-sinecoulomb-matrix/34423

ardunn commented 4 years ago

https://matsci.org/t/compositiontooxidcomposition-hangs/34383/3

hackingmaterials / automatminer

Predicting on large DF runs into infinite loop #272
