Closed cflerin closed 2 years ago
Hi @cflerin @dweemx !
I have already discussed this with several people on Slack. The error comes from the symbolic links Kris made to avoid writing directly to tmp (you may see it in other functions too if you don't set the temp_dir). In principle the links shouldn't trigger the error, and Ray should just write to the linked folder; I will check it with Kris. How many cores have you tried with? I also leave some quick test code below:
# Reproducible pyranges error
import pyranges as pr
f1 = pr.from_dict({'Chromosome': ['chr1', 'chr1', 'chr1'], 'Start': [3, 8, 5],
                   'End': [6, 9, 7], 'Name': ['interval1', 'interval3', 'interval2']})
f2 = pr.from_dict({'Chromosome': ['chr1', 'chr1'], 'Start': [1, 6],
                   'End': [2, 7], 'Name': ['a', 'b']})
f1.join(f2, nb_cpu=2)
# Error
The best solution for now is to use only 1 core. The running time is very similar to running it with ray anyway (when using ray, some time is spent initializing the dashboard), so I could also force n_cpu = 1 internally instead of exposing it as a parameter. The operations we do with pyranges are quite fast, so we could use this to avoid ray altogether in these steps :).
There is unfortunately no stable way to configure _temp_dir without calling ray.init. We could also open an issue in pyranges (it would take minor changes in their multithreaded.py file) or directly in ray, I think. The issue is that if _temp_dir=None in ray.init (the default, and what pyranges uses), this is the function that is called to determine the _temp_dir:
def get_user_temp_dir():
    if sys.platform.startswith("darwin") or sys.platform.startswith("linux"):
        # Ideally we wouldn't need this fallback, but keep it for now for
        # for compatibility
        tempdir = os.path.join(os.sep, "tmp")
    else:
        tempdir = tempfile.gettempdir()
    return tempdir
This basically forces it to be /tmp/ray, since our system is Linux (not happy with their solution, though).
So for now, use n_cpu=1 and I'll check with Kris whether we can make it work with the symbolic links (this is relevant for other functions too). Otherwise, we could 1) avoid ray in the pyranges operations (the performance difference is small anyway), 2) open an issue in pyranges to pass _temp_dir, or 3) open an issue in ray to use the Python tmpdir instead of forcing /tmp/ray (the latter two will depend on the developers).
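A possible workaround from the calling side, sketched under assumptions: start Ray yourself with an explicit `_temp_dir` before calling pyranges. This only helps if pyranges then reuses the already-running Ray instance instead of starting its own, which is an assumption here. `init_ray_with_temp_dir` is a hypothetical helper, not part of pycisTopic or pyranges.

```python
import os


def init_ray_with_temp_dir(temp_dir, num_cpus=2):
    """Start Ray with its session files under temp_dir instead of /tmp/ray.

    Hypothetical helper: assumes ray is installed and that later callers
    (e.g. pyranges) reuse this already-running Ray instance.
    """
    import ray  # imported lazily so the helper can be defined without ray

    os.makedirs(temp_dir, exist_ok=True)
    # _temp_dir is the ray.init argument discussed above; ignore_reinit_error
    # avoids an exception if Ray was already started in this process.
    return ray.init(num_cpus=num_cpus, _temp_dir=temp_dir, ignore_reinit_error=True)
```

Usage would be something like `init_ray_with_temp_dir("/path/to/new/tmp")` before `f1.join(f2, nb_cpu=2)`. If Ray is already running in the process, the new `_temp_dir` is not applied.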
Cheers, keep you posted!
C
I guess the easiest thing is to just set n_cpu=1 and leave it like that, since there is little performance hit with this.
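If forcing a single core everywhere feels too blunt, a guard could fall back to one core only when the Ray temp location looks unusable. This is a minimal sketch: `choose_n_cpu` is a hypothetical helper, not part of pycisTopic, and checking that the parent directory is writable only approximates the failure seen here (it may not catch the symlink problem).

```python
import os
import tempfile


def choose_n_cpu(requested, ray_tmp_parent=os.path.join(os.sep, "tmp")):
    """Return `requested` cores only if the temp parent directory is usable.

    Hypothetical helper: falls back to 1 core (no Ray) when the directory
    Ray would write under is missing or not writable.
    """
    if requested <= 1:
        return 1
    usable = os.path.isdir(ray_tmp_parent) and os.access(
        ray_tmp_parent, os.W_OK | os.X_OK
    )
    return requested if usable else 1


# A freshly created temporary directory is writable, so 4 cores are kept.
with tempfile.TemporaryDirectory() as d:
    print(choose_n_cpu(4, ray_tmp_parent=d))

# A path that does not exist falls back to a single core.
print(choose_n_cpu(4, ray_tmp_parent=os.path.join(d, "missing")))
```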
When running from the Singularity image, we can force tmp elsewhere with a volume mapping (singularity run -B /new/path/to/tmp:/tmp ...) and this works well also.
@cflerin Probably worth adding to your pycisTopic Jupyter kernel config file.
This seems to be solved in the development version at least: https://github.com/ray-project/ray/blob/fabba96fadf833dfeb9d10d9704debc9454f4815/python/ray/_private/utils.py (get_user_temp_dir), so updating ray to this version and setting os.environ['RAY_TMPDIR'] should work well too.
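To illustrate the idea of an environment-aware lookup, here is a minimal sketch. It is not a copy of Ray's code: `resolve_temp_dir` is a hypothetical name, and the precedence order (RAY_TMPDIR first, then TMPDIR on Linux, then the old /tmp fallback) is an assumption.

```python
import os
import sys
import tempfile


def resolve_temp_dir(environ=None, platform=None):
    """Pick a temp dir, letting environment variables override the /tmp fallback.

    Hypothetical sketch; `environ` and `platform` are injectable so the
    lookup can be exercised without touching the real environment.
    """
    environ = os.environ if environ is None else environ
    platform = sys.platform if platform is None else platform

    # Assumed precedence: an explicit Ray-specific variable wins.
    if "RAY_TMPDIR" in environ:
        return environ["RAY_TMPDIR"]
    # Then the generic TMPDIR, on Linux only.
    if platform.startswith("linux") and "TMPDIR" in environ:
        return environ["TMPDIR"]
    # Otherwise keep the fallback from the function quoted earlier.
    if platform.startswith("darwin") or platform.startswith("linux"):
        return os.path.join(os.sep, "tmp")
    return tempfile.gettempdir()
```

With a lookup like this, `resolve_temp_dir({"TMPDIR": "/path/to/new/tmp"}, "linux")` returns the override instead of the hard-coded tmp directory.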
Indeed, installing the latest daily release (pip install -U https://s3-us-west-2.amazonaws.com/ray-wheels/latest/ray-2.0.0.dev0-cp37-cp37m-manylinux2014_x86_64.whl, see https://docs.ray.io/en/master/installation.html for Python versions other than 3.7) and running the code below works now (also with 'TMPDIR' in this release). If you find this a good solution you can close the issue :).
# Reproducible pyranges error
import os
# Set TMPDIR before ray starts. Assigning to os.environ also calls putenv,
# whereas os.putenv alone does not update os.environ.
os.environ['TMPDIR'] = '/scratch/leuven/313/vsc31305/ray_spill'
import pyranges as pr
f1 = pr.from_dict({'Chromosome': ['chr1', 'chr1', 'chr1'], 'Start': [3, 8, 5],
                   'End': [6, 9, 7], 'Name': ['interval1', 'interval3', 'interval2']})
f2 = pr.from_dict({'Chromosome': ['chr1', 'chr1'], 'Start': [1, 6],
                   'End': [2, 7], 'Name': ['a', 'b']})
f1.join(f2, nb_cpu=2)
# Works!
Describe the bug When running create_cistopic_object_from_fragments with n_cpus>1, the default temp location for Ray can't be written to. Figured this out with @dweemx today.
To Reproduce
Error output
Expected behavior Direct ray temp usage to another location and complete the function.
Screenshots N/A
Version (please complete the following information):
Additional context Apparently this is due to pyranges using Ray for parallelization, but there is no option to redirect the temp location (that I know of). Possibly we can add the _temp_dir parameter and pass this on to pyranges. We did try os.environ["TMPDIR"] = "/path/to/new/tmp" but it does not seem to have any effect.