Open kmdalton opened 2 weeks ago
This all looks good to me, but I'd like to play with it a bit more before merging. I think I agree with making ray an optional dependency, but I don't think I like adding it to tests_require -- seems like a hacky solution.
What are your thoughts on us adding an explicit parallel_require=["ray"], that is added to [dev] and maybe a new [parallel] pip option that only adds on the ray extra requirement?
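For concreteness, the layout proposed above might look roughly like the sketch below. Only parallel_require=["ray"] and the [dev] and [parallel] extras come from this thread; the contents of tests_require and docs_require are placeholder assumptions, and the real setup.py may organize its requirement lists differently.

```python
# Sketch of the proposed extras layout (an illustration, not the project's actual setup.py).
parallel_require = ["ray"]   # from the proposal in this thread
tests_require = ["pytest"]   # assumed placeholder
docs_require = ["sphinx"]    # assumed placeholder

extras_require = {
    # New extra that only pulls in ray
    "parallel": parallel_require,
    # Developer extra that also includes ray, deduplicated and sorted
    "dev": sorted(set(tests_require + docs_require + parallel_require)),
}

# This dict would then be passed along as
# setuptools.setup(..., extras_require=extras_require)
```

With something like this in place, users who want parallelism could presumably run pip install "reciprocalspaceship[parallel]", while a plain install would leave ray out.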
I'm happy to defer to your preferences regarding requirements. I don't have strong feelings as long as we make it easy for users to figure out how to get parallelism. I am not yet familiar enough with ray to know how nicely it plays with other packages. So far it seems very promising.
@marinegor, would you be willing to test out this branch for us?
@kmdalton sure, I can have a look! What kind of testing are you thinking of? Could you elaborate? I imagine you want to make sure that your parser produces the same results as the previous one, right?
Thank you! I don't have access to a lot of stream files, and I have noticed there can be some differences in the metadata between files. Mostly I want to make sure I'm not missing anything that will break the parser in some edge cases. Additionally, I would hope you could let us know
- ray, in order to run in parallel
- read_crystfel and the StreamLoader class are adequately documented
$ conda install -c conda-forge "ray-default"
fails for me like this in a fresh conda environment in which I tried (and maybe failed) to use your careless install script:
Channels:
- conda-forge
- defaults
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: failed
LibMambaUnsatisfiableError: Encountered problems while solving:
- nothing provides _python_rc needed by python-3.12.0rc3-rc3_hab00c5b_1_cpython
Could not solve for environment specs
The following packages are incompatible
├─ python 3.12** is installable with the potential options
│ ├─ python [3.12.0|3.12.1|3.12.2|3.12.3|3.12.4], which can be installed;
│ ├─ python [3.12.0|3.12.1|3.12.2|3.12.3|3.12.4] would require
│ │ └─ python_abi 3.12.* *_cp312, which can be installed;
│ └─ python 3.12.0rc3 would require
│ └─ _python_rc, which does not exist (perhaps a missing channel);
└─ ray-default is not installable because there are no viable options
├─ ray-default [1.10.0|1.11.0|...|2.0.0] would require
│ ├─ python >=3.7,<3.8.0a0 , which conflicts with any installable versions previously reported;
│ └─ python_abi 3.7.* *_cp37m, which conflicts with any installable versions previously reported;
├─ ray-default [1.10.0|1.11.0|...|2.9.3] would require
│ ├─ python >=3.8,<3.9.0a0 , which conflicts with any installable versions previously reported;
│ └─ python_abi 3.8.* *_cp38, which conflicts with any installable versions previously reported;
├─ ray-default [1.10.0|1.11.0|...|2.9.3] would require
│ ├─ python >=3.9,<3.10.0a0 , which conflicts with any installable versions previously reported;
│ └─ python_abi 3.9.* *_cp39, which conflicts with any installable versions previously reported;
├─ ray-default [1.13.0|2.0.0|...|2.9.3] would require
│ ├─ python >=3.10,<3.11.0a0 , which conflicts with any installable versions previously reported;
│ └─ python_abi 3.10.* *_cp310, which conflicts with any installable versions previously reported;
├─ ray-default [1.5.0|1.5.1|1.5.2|1.6.0] would require
│ ├─ python >=3.6,<3.7.0a0 , which conflicts with any installable versions previously reported;
│ └─ python_abi 3.6.* *_cp36m, which conflicts with any installable versions previously reported;
├─ ray-default [2.10.0|2.11.0|...|2.9.3] would require
│ ├─ python >=3.11,<3.12.0a0 , which conflicts with any installable versions previously reported;
│ └─ python_abi 3.11.* *_cp311, which conflicts with any installable versions previously reported;
├─ ray-default 2.8.0 would require
│ └─ ray-core 2.8.0 py38h1702d6c_1, which does not exist (perhaps a missing channel);
├─ ray-default [1.6.0|1.9.2|2.0.1] would require
│ └─ python >=3.7,<3.8.0a0 , which conflicts with any installable versions previously reported;
├─ ray-default [1.6.0|1.9.2|2.0.1|2.3.0|2.6.3] would require
│ └─ python >=3.8,<3.9.0a0 , which conflicts with any installable versions previously reported;
├─ ray-default [1.6.0|1.9.2|2.0.1|2.3.0|2.6.3] would require
│ └─ python >=3.9,<3.10.0a0 , which conflicts with any installable versions previously reported;
├─ ray-default [2.0.1|2.3.0|2.6.3] would require
│ └─ python >=3.10,<3.11.0a0 , which conflicts with any installable versions previously reported;
└─ ray-default 2.6.3 would require
└─ python >=3.11,<3.12.0a0 , which conflicts with any installable versions previously reported.
This also fails:
(careless-13)[dhekstra@holy8a24301 reciprocalspaceship]$ pip install -U "ray"
ERROR: Could not find a version that satisfies the requirement ray (from versions: none)
ERROR: No matching distribution found for ray
(careless-13)[dhekstra@holy8a24301 reciprocalspaceship]$ pip install -U "ray[default]"
ERROR: Could not find a version that satisfies the requirement ray[default] (from versions: none)
ERROR: No matching distribution found for ray[default]
(careless-13)[dhekstra@holy8a24301 reciprocalspaceship]$ pip install ray
ERROR: Could not find a version that satisfies the requirement ray (from versions: none)
ERROR: No matching distribution found for ray
@DHekstra, I think some of your packages (careless for sure) do not have Python 3.12 support, which is confusing the package solver. You should use Python 3.11 for now.
This PR adds support for faster stream file parsing, which is parallelized using ray. I did not add ray as a dependency for users, so the code falls back to serial Python when ray is not available.
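A minimal sketch of the optional-dependency fallback described above, for readers unfamiliar with the pattern. The functions parse_chunk and parse_all are hypothetical stand-ins, not the names used in this PR, and the actual StreamLoader implementation may dispatch work differently. The ray calls used (ray.init, ray.is_initialized, ray.remote, ray.get) are part of ray's public API.

```python
try:
    import ray  # optional dependency; only needed for parallel parsing
except ImportError:
    ray = None


def parse_chunk(chunk):
    """Hypothetical per-chunk parser: counts whitespace-separated tokens."""
    return len(chunk.split())


def parse_all(chunks, parallel=True):
    """Parse chunks with ray when it is installed, otherwise serially."""
    if parallel and ray is not None:
        if not ray.is_initialized():
            ray.init()
        remote_parse = ray.remote(parse_chunk)  # wrap as a ray remote function
        return ray.get([remote_parse.remote(c) for c in chunks])
    # Serial fallback: same results, no ray required
    return [parse_chunk(c) for c in chunks]
```

Because the serial branch uses the same per-chunk function, both paths should return identical results, which is also the property worth checking when testing the branch against real stream files.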