mars-project / mars

Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and Python functions.
https://mars-project.readthedocs.io
Apache License 2.0
2.7k stars 326 forks

[BUG] Exception raised at exit after train and predict #2780

Open wuyeguo opened 2 years ago

wuyeguo commented 2 years ago

Describe the bug: Exceptions are raised when the program exits after training and predicting with MarsDistributedModel. My code looks like this:

[ray@ml-test ~]$ cat test_mars_sm.py
import ray
ray.init(address="ray://172.16.210.22:10001")

import mars
import mars.tensor as mt
import mars.dataframe as md
session = mars.new_ray_session(worker_num=2, worker_mem=2 * 1024 ** 3)

from sklearn.datasets import load_boston
boston = load_boston()

data = md.DataFrame(boston.data, columns=boston.feature_names)

print("data.head().execute()")
print(data.head().execute())

print("data.describe().execute()")
print(data.describe().execute())

from mars.learn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(data, boston.target, train_size=0.7, random_state=0)

print("X_train: %s" % X_train)

from mars.learn.contrib import statsmodels as msm

model = msm.MarsDistributedModel(num_partitions=5)

print("model.fit")
results = model.fit(y_train, X_train, alpha=0.2)

print("results.predict")
test_r = results.predict(X_test)

print("output:test_r:%s" % type(test_r))
print(test_r)

To Reproduce: To help us reproduce this bug, please provide the information below:

  1. Your Python version: 3.7.5
  2. The version of Mars you use: commit 4474103aa61ffbb08d62481388a3e1c7f9dba98d
  3. Versions of crucial packages, such as numpy, scipy and pandas
    1. Ray: 1.9.2
    2. Numpy: 1.21.5
    3. Pandas: 1.3.5
    4. Scipy: 1.7.3
  4. Full stack of the error.
    
    Exception ignored in: <function _TileableSession.__init__.<locals>.cb at 0x7fc9f777c5f0>
    Traceback (most recent call last):
    File "/usr/local/python3/lib/python3.7/site-packages/mars/core/entity/executable.py", line 52, in cb
    File "/usr/local/python3/lib/python3.7/concurrent/futures/thread.py", line 163, in submit
    RuntimeError: cannot schedule new futures after shutdown
    Exception ignored in: <function _TileableSession.__init__.<locals>.cb at 0x7fc9f7794c20>
    Traceback (most recent call last):
    File "/usr/local/python3/lib/python3.7/site-packages/mars/core/entity/executable.py", line 52, in cb
    File "/usr/local/python3/lib/python3.7/concurrent/futures/thread.py", line 163, in submit
    RuntimeError: cannot schedule new futures after shutdown
    Exception ignored in: <function _TileableSession.__init__.<locals>.cb at 0x7fc9f7794950>
    Traceback (most recent call last):
    File "/usr/local/python3/lib/python3.7/site-packages/mars/core/entity/executable.py", line 52, in cb
    File "/usr/local/python3/lib/python3.7/concurrent/futures/thread.py", line 163, in submit
    RuntimeError: cannot schedule new futures after shutdown
    Exception ignored in: <function _TileableSession.__init__.<locals>.cb at 0x7fc9f76f0cb0>
    Traceback (most recent call last):
    File "/usr/local/python3/lib/python3.7/site-packages/mars/core/entity/executable.py", line 52, in cb
    File "/usr/local/python3/lib/python3.7/concurrent/futures/thread.py", line 163, in submit
    RuntimeError: cannot schedule new futures after shutdown
    Exception ignored in: <function _TileableSession.__init__.<locals>.cb at 0x7fc9f7798a70>
    Traceback (most recent call last):
    File "/usr/local/python3/lib/python3.7/site-packages/mars/core/entity/executable.py", line 52, in cb
    File "/usr/local/python3/lib/python3.7/concurrent/futures/thread.py", line 163, in submit
    RuntimeError: cannot schedule new futures after shutdown

5. Minimized code to reproduce the error.

qinxuye commented 2 years ago
Exception ignored in: <function _TileableSession.__init__.<locals>.cb at 0x7fc9f777c5f0>

It looks like the job executed successfully. Mars then tries to do some cleanup, such as deleting data, but by that point the server is already closed. This does not actually affect the result.
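The mechanism can be reproduced with the standard library alone. This is a minimal sketch, not Mars code: it assumes `cb` is a weakref callback that hands cleanup work to a thread pool, which is what the "Exception ignored in" prefix and the `thread.py ... submit` frame in the traceback suggest. `Tileable` and `reproduce` are hypothetical names, and `sys.unraisablehook` needs Python 3.8 or later.

```python
import sys
import weakref
from concurrent.futures import ThreadPoolExecutor


class Tileable:
    """Hypothetical stand-in for a Mars data object."""


def reproduce():
    pool = ThreadPoolExecutor(max_workers=1)

    def cb(_ref):
        # Cleanup is handed to the pool; this fails once the pool is shut down.
        pool.submit(print, "deleting data")

    captured = []
    old_hook = sys.unraisablehook
    # Capture what Python would otherwise print as "Exception ignored in: ...".
    sys.unraisablehook = lambda info: captured.append(info.exc_value)
    try:
        obj = Tileable()
        ref = weakref.ref(obj, cb)  # keep the weakref alive so cb can fire
        pool.shutdown()             # the executor goes away first
        del obj                     # in CPython the callback runs here
    finally:
        sys.unraisablehook = old_hook
    return captured


if __name__ == "__main__":
    print(reproduce())
```

Because the exception is raised inside a callback, Python cannot propagate it to the caller; it only reports it on stderr, which is why the script's results are unaffected.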

wuyeguo commented 2 years ago

Yes, I get the right result, but exceptions are raised when my program exits.

qinxuye commented 2 years ago

#2709 and #2711 seem to be the same problem. The error does not affect the result, but it does confuse people.

This is a kind of enhancement: we should silence those exceptions when the server is already closed.
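One way such an enhancement could look, sketched with the standard library only. This is not the actual Mars fix: `make_quiet_callback`, `Tileable` and `demo` are hypothetical names, and the sketch assumes the cleanup is submitted to a `ThreadPoolExecutor` from a weakref callback, as the traceback suggests.

```python
import weakref
from concurrent.futures import ThreadPoolExecutor


class Tileable:
    """Hypothetical stand-in for a Mars data object."""


def make_quiet_callback(pool, cleanup):
    """Build a weakref callback that skips cleanup if the pool is closed."""

    def cb(_ref):
        try:
            pool.submit(cleanup)
        except RuntimeError:
            # The executor is already shut down, so there is nothing
            # left to clean up; stay silent instead of alarming the user.
            pass

    return cb


def demo():
    pool = ThreadPoolExecutor(max_workers=1)
    done = []
    cb = make_quiet_callback(pool, lambda: done.append("cleaned"))

    a, b = Tileable(), Tileable()
    ref_a = weakref.ref(a, cb)  # keep both weakrefs alive so cb can fire
    ref_b = weakref.ref(b, cb)

    del a                     # pool is alive: cleanup is submitted
    pool.shutdown(wait=True)  # waits for the pending cleanup to finish
    del b                     # pool is closed: the error is swallowed
    return done


if __name__ == "__main__":
    print(demo())
```

Catching only `RuntimeError` around `submit` keeps other failures visible; a real fix would probably also want to check that the session itself is closed before deciding to stay quiet.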