hosokawa-taiji commented 5 years ago

I ran run_ppo.py and got RuntimeError: size mismatch. It worked fine a few weeks ago. THTensorMath.cpp in PyTorch was changed on 13 Nov, and I think that may be the cause. Could you look into this?

hosokawa-taiji commented 5 years ago

It occurs only with my own custom gym environment, so the problem may be in my environment.

mmisono commented 5 years ago

Could you describe the details? That is,

The minimal reproducible code
The output of pip freeze
OS version
Nvidia driver version
CUDA version

hosokawa-taiji commented 5 years ago

I'm using Google Colaboratory. Here is what I found.

The output of pip freeze is

absl-py==0.8.1
alabaster==0.7.12
albumentations==0.1.12
altair==3.2.0
astor==0.8.0
astropy==3.0.5
atari-py==0.2.6
atomicwrites==1.3.0
attrs==19.3.0
audioread==2.1.8
autograd==1.3
Babel==2.7.0
backcall==0.1.0
backports.tempfile==1.0
backports.weakref==1.0.post1
beautifulsoup4==4.6.3
bleach==3.1.0
blis==0.2.4
bokeh==1.0.4
boto==2.49.0
boto3==1.10.14
botocore==1.13.14
Bottleneck==1.2.1
branca==0.3.1
bs4==0.0.1
bz2file==0.98
cached-property==1.5.1
cachetools==3.1.1
certifi==2019.9.11
cffi==1.13.2
chainer==6.5.0
chardet==3.0.4
chart-studio==1.0.0
Click==7.0
cloudpickle==1.2.2
cmake==3.12.0
colorama==0.4.1
colorlover==0.3.0
community==1.0.0b1
contextlib2==0.5.5
convertdate==2.2.0
coverage==3.7.1
coveralls==0.5
crcmod==1.7
cufflinks==0.17.0
cupy-cuda100==6.5.0
cvxopt==1.2.3
cvxpy==1.0.25
cycler==0.10.0
cymem==2.0.2
Cython==0.29.14
daft==0.0.4
dask==1.1.5
dataclasses==0.7
datascience==0.10.6
decorator==4.4.1
defusedxml==0.6.0
descartes==1.1.0
dill==0.3.1.1
distributed==1.25.3
Django==2.2.7
dlib==19.16.0
dm-sonnet==1.35
docopt==0.6.2
docutils==0.15.2
dopamine-rl==1.0.5
earthengine-api==0.1.205
easydict==1.9
ecos==2.0.7.post1
editdistance==0.5.3
en-core-web-sm==2.1.0
entrypoints==0.3
et-xmlfile==1.0.1
fa2==0.3.5
fancyimpute==0.4.3
fastai==1.0.59
fastcache==1.1.0
fastdtw==0.3.4
fastprogress==0.1.21
fastrlock==0.4
fbprophet==0.5
feather-format==0.4.0
featuretools==0.4.1
filelock==3.0.12
fix-yahoo-finance==0.0.22
Flask==1.1.1
folium==0.8.3
fsspec==0.5.2
funcsigs==1.0.2
future==0.16.0
gast==0.2.2
GDAL==2.2.2
gdown==3.6.4
gensim==3.6.0
geographiclib==1.50
geopy==1.17.0
gevent==1.4.0
gin-config==0.2.1
glob2==0.7
google==2.0.2
google-api-core==1.14.3
google-api-python-client==1.7.11
google-auth==1.4.2
google-auth-httplib2==0.0.3
google-auth-oauthlib==0.4.1
google-cloud-bigquery==1.21.0
google-cloud-core==1.0.3
google-cloud-datastore==1.8.0
google-cloud-language==1.2.0
google-cloud-storage==1.16.2
google-cloud-translate==1.5.0
google-colab==1.0.0
google-pasta==0.1.8
google-resumable-media==0.4.1
googleapis-common-protos==1.6.0
googledrivedownloader==0.4
graph-nets==1.0.5
graphviz==0.10.1
greenlet==0.4.15
grpcio==1.15.0
gspread==3.0.1
gspread-dataframe==3.0.3
gunicorn==20.0.0
gym==0.15.4
h5py==2.8.0
HeapDict==1.0.1
holidays==0.9.11
html5lib==1.0.1
httpimport==0.5.18
httplib2==0.11.3
humanize==0.5.1
hyperopt==0.1.2
ideep4py==2.0.0.post3
idna==2.8
image==1.5.27
imageio==2.4.1
imagesize==1.1.0
imbalanced-learn==0.4.3
imblearn==0.0
imgaug==0.2.9
importlib-metadata==0.23
imutils==0.5.3
inflect==2.1.0
intel-openmp==2019.0
intervaltree==2.1.0
ipykernel==4.6.1
ipython==5.5.0
ipython-genutils==0.2.0
ipython-sql==0.3.9
ipywidgets==7.5.1
itsdangerous==1.1.0
jax==0.1.50
jaxlib==0.1.32
jdcal==1.4.1
jedi==0.15.1
jieba==0.39
Jinja2==2.10.3
jmespath==0.9.4
joblib==0.14.0
jpeg4py==0.1.4
jsonschema==2.6.0
jupyter==1.0.0
jupyter-client==5.3.4
jupyter-console==5.2.0
jupyter-core==4.6.1
kaggle==1.5.6
kapre==0.1.3.1
Keras==2.2.5
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.0
keras-vis==0.4.1
kfac==0.2.0
kiwisolver==1.1.0
knnimpute==0.1.0
librosa==0.6.3
lightgbm==2.2.3
llvmlite==0.30.0
lmdb==0.98
lucid==0.3.8
lunardate==0.2.0
lxml==4.2.6
machina-rl==0.2.1
magenta==0.3.19
Markdown==3.1.1
MarkupSafe==1.1.1
matplotlib==3.1.1
matplotlib-venn==0.11.5
mesh-tensorflow==0.1.4
mido==1.2.6
mir-eval==0.5
missingno==0.4.2
mistune==0.8.4
mizani==0.5.4
mkl==2019.0
mlxtend==0.14.0
more-itertools==7.2.0
moviepy==0.2.3.5
mpi4py==3.0.3
mpmath==1.1.0
msgpack==0.5.6
multiprocess==0.70.9
multitasking==0.0.9
murmurhash==1.0.2
music21==5.5.0
natsort==5.5.0
nbconvert==5.6.1
nbformat==4.4.0
networkx==2.4
nibabel==2.3.3
nltk==3.2.5
notebook==5.2.2
np-utils==0.5.11.1
numba==0.40.1
numexpr==2.7.0
numpy==1.17.4
nvidia-ml-py3==7.352.0
oauth2client==4.1.3
oauthlib==3.1.0
okgrade==0.4.3
olefile==0.46
opencv-contrib-python==3.4.3.18
opencv-python==3.4.7.28
openpyxl==2.5.9
opt-einsum==3.1.0
osqp==0.6.1
packaging==19.2
palettable==3.3.0
pandas==0.25.3
pandas-datareader==0.7.4
pandas-gbq==0.11.0
pandas-profiling==1.4.1
pandocfilters==1.4.2
parso==0.5.1
pathlib==1.0.1
patsy==0.5.1
pexpect==4.7.0
pickleshare==0.7.5
Pillow==4.3.0
pip-tools==4.2.0
plac==0.9.6
plotly==4.1.1
plotnine==0.5.1
pluggy==0.7.1
portpicker==1.2.0
prefetch-generator==1.0.1
preshed==2.0.1
pretty-midi==0.2.8
prettytable==0.7.2
progressbar2==3.38.0
prometheus-client==0.7.1
promise==2.2.1
prompt-toolkit==1.0.18
protobuf==3.10.0
psutil==5.4.8
psycopg2==2.7.6.1
ptyprocess==0.6.0
py==1.8.0
py-spy==0.3.0
pyarrow==0.14.1
pyasn1==0.4.7
pyasn1-modules==0.2.7
pycocotools==2.0.0
pycparser==2.19
pydata-google-auth==0.1.3
pydot==1.3.0
pydot-ng==2.0.0
pydotplus==2.0.2
PyDrive==1.3.1
pyemd==0.5.1
pyglet==1.3.2
Pygments==2.1.3
pygobject==3.26.1
pymc3==3.7
PyMeeus==0.3.6
pymongo==3.9.0
pymystem3==0.2.0
PyOpenGL==3.1.0
pyparsing==2.4.5
pypng==0.0.20
pyrsistent==0.15.5
pysndfile==1.3.8
PySocks==1.7.1
pystan==2.19.1.1
pytest==3.6.4
python-apt==1.6.4
python-chess==0.23.11
python-dateutil==2.6.1
python-louvain==0.13
python-rtmidi==1.3.1
python-slugify==4.0.0
python-utils==2.3.0
pytz==2018.9
PyWavelets==1.1.1
PyYAML==3.13
pyzmq==17.0.0
qtconsole==4.5.5
ray==0.7.6
redis==3.3.11
requests==2.21.0
requests-oauthlib==1.3.0
resampy==0.2.2
retrying==1.3.3
rpy2==2.9.5
rsa==4.0
s3fs==0.3.5
s3transfer==0.2.1
scikit-image==0.15.0
scikit-learn==0.21.3
scipy==1.3.2
screen-resolution-extra==0.0.0
scs==2.1.1.post2
seaborn==0.9.0
semantic-version==2.8.2
Send2Trash==1.5.0
setproctitle==1.1.10
setuptools-git==1.2
Shapely==1.6.4.post2
simplegeneric==0.8.1
six==1.12.0
sklearn==0.0
sklearn-pandas==1.8.0
smart-open==1.9.0
snowballstemmer==2.0.0
sortedcontainers==2.1.0
spacy==2.1.9
Sphinx==1.8.5
sphinxcontrib-websupport==1.1.2
SQLAlchemy==1.3.10
sqlparse==0.3.0
srsly==0.2.0
stable-baselines==2.2.1
statsmodels==0.10.1
sympy==1.1.1
tables==3.4.4
tabulate==0.8.5
tblib==1.5.0
tensor2tensor==1.14.1
tensorboard==1.15.0
tensorboardcolab==0.0.22
tensorflow==1.15.0
tensorflow-datasets==1.3.0
tensorflow-estimator==1.15.1
tensorflow-gan==2.0.0
tensorflow-hub==0.7.0
tensorflow-metadata==0.15.0
tensorflow-privacy==0.2.2
tensorflow-probability==0.7.0
termcolor==1.1.0
terminado==0.8.2
terminaltables==3.1.0
testpath==0.4.4
text-unidecode==1.3
textblob==0.15.3
textgenrnn==1.4.1
tflearn==0.3.2
Theano==1.0.4
thinc==7.0.8
toolz==0.10.0
torch==1.3.1+cu100
torchsummary==1.5.1
torchtext==0.3.1
torchvision==0.4.2+cu100
tornado==4.5.3
tqdm==4.28.1
traitlets==4.3.3
tweepy==3.6.0
typing==3.6.6
typing-extensions==3.6.6
tzlocal==1.5.1
umap-learn==0.3.10
uritemplate==3.0.0
urllib3==1.24.3
vega-datasets==0.7.0
wasabi==0.4.0
wcwidth==0.1.7
webencodings==0.5.1
Werkzeug==0.16.0
widgetsnbextension==3.5.1
wordcloud==1.5.0
wrapt==1.11.2
xarray==0.11.3
xgboost==0.90
xkit==0.0.0
xlrd==1.1.0
xlwt==1.3.0
yellowbrick==0.9.1
zict==1.0.0
zipp==0.6.0
zmq==0.0.0

hosokawa-taiji commented 5 years ago

OS version is Ubuntu 18.04.3 LTS
Nvidia driver version is (is this correct?) /usr/lib64-nvidia/libcuda.so /usr/lib64-nvidia/libcuda.so.418.67 /usr/lib64-nvidia/libcuda.so.1
CUDA version is nvcc: NVIDIA (R) Cuda compiler driver Copyright (c) 2005-2018 NVIDIA Corporation Built on Sat_Aug_25_21:08:01_CDT_2018 Cuda compilation tools, release 10.0, V10.0.130

hosokawa-taiji commented 5 years ago

The minimal reproducible code is


import gym
from gym import spaces
import numpy as np
import pandas as pd
from sklearn import preprocessing

class FxEnv_v0(gym.Env):

  # Constants
  WINDOW = 600
  PCT_CHANGE = 600
  HP = 100
  THRESHOLD = 3

  def __init__(self):
    super().__init__()
    # Load and transform the data
    self.data = self.loadAndTransformData()
    # Set action_space
    self.action_space = gym.spaces.Box(
      low = -10,
      high = 10,
      shape = (1,),
      dtype = np.int64
    )
    # Set observation_space
    self.observation_space = gym.spaces.Box(
      low = -10,
      high = 10,
      shape = ((self.WINDOW,)),
      dtype = np.float64
    )
    # Set reward_range
    self.reward_range = [-2., 10.]
    # Initial settings
    self.position = 0
    self.hp = self.HP
    self.reset()

  def step(self, action):
    # Compute reward and hp
    reward = 0
    nextData = self.data[self.position + self.WINDOW].round()
    if(action == nextData):
      reward = abs(action)
    else:
      reward = -0.1 * (abs(action - nextData)[0])
      self.hp -= 1
    if(action == 0 and nextData == 0):
      reward = 1
    # Update position
    self.position += 1
    return self.getObservation(), reward, self.isDone(), {}

  def reset(self):
    # Set HP
    self.hp = self.HP
    # Set observation
    return self.getObservation()

  def render(self, mode='human', close=False):
    pass

  def close(self):
    pass

  def seed(self, seed=None):
    pass

  def loadAndTransformData(self):
    data = pd.read_csv('drive/My Drive/Colab Notebooks/GBPJPY1.csv', names=(['Bid']))
    data = data.rolling(self.WINDOW).mean()
    data = data.pct_change(self.PCT_CHANGE).dropna()
    scaler = preprocessing.MinMaxScaler(feature_range=(-10, 10))
    return scaler.fit_transform(data)

  def getObservation(self):
    return self.data[self.position : self.position + self.WINDOW]

  def isDone(self):
    if(self.hp <= 0):
      return True
    elif(self.position == len(self.data) - self.WINDOW):
      self.position = 0
      return True
    else:
      return False

hosokawa-taiji commented 5 years ago

Then I ran this command:

python machina/example/run_ppo.py --cuda 0 --env_name 'Fx-v0' --rnn

and got this error message.

{'batch_size': 256,
 'c2d': False,
 'clip_param': 0.2,
 'cuda': 0,
 'env_name': 'Fx-v0',
 'epoch_per_iter': 10,
 'gamma': 0.995,
 'init_kl_beta': 1,
 'kl_targ': 0.01,
 'lam': 1,
 'log': 'garbage',
 'max_epis': 1000000,
 'max_grad_norm': 10,
 'max_steps_per_iter': 10000,
 'num_parallel': 4,
 'pol_lr': 0.0003,
 'ppo_type': 'clip',
 'record': False,
 'rnn': True,
 'rnn_batch_size': 8,
 'seed': 256,
 'vf_lr': 0.0003}
2019-11-15 09:12:11.192617 UTC | observation space: Box(600,)
2019-11-15 09:12:11.192765 UTC | action space: Box(1,)
Process Process-4:
Process Process-3:
Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.6/dist-packages/machina/samplers/epi_sampler.py", line 126, in mp_sample
    l, epi = one_epi(env, pol, deterministic_flag, prepro)
  File "/usr/local/lib/python3.6/dist-packages/machina/samplers/epi_sampler.py", line 51, in one_epi
    ac_real, ac, a_i = pol(torch.tensor(o, dtype=torch.float))
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/machina/pols/gaussian_pol.py", line 50, in forward
    mean, log_std, hs = self.net(obs, hs, h_masks)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/machina/example/simple_net.py", line 169, in forward
    xs = torch.relu(self.input_layer(xs))
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/linear.py", line 87, in forward
    return F.linear(input, self.weight, self.bias)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py", line 1372, in linear
    output = input.matmul(weight.t())
RuntimeError: size mismatch, m1: [600 x 1], m2: [600 x 256] at /pytorch/aten/src/TH/generic/THTensorMath.cpp:197
Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.6/dist-packages/machina/samplers/epi_sampler.py", line 126, in mp_sample
    l, epi = one_epi(env, pol, deterministic_flag, prepro)
  File "/usr/local/lib/python3.6/dist-packages/machina/samplers/epi_sampler.py", line 51, in one_epi
    ac_real, ac, a_i = pol(torch.tensor(o, dtype=torch.float))
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/machina/pols/gaussian_pol.py", line 50, in forward
    mean, log_std, hs = self.net(obs, hs, h_masks)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/machina/example/simple_net.py", line 169, in forward
    xs = torch.relu(self.input_layer(xs))
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/linear.py", line 87, in forward
    return F.linear(input, self.weight, self.bias)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py", line 1372, in linear
    output = input.matmul(weight.t())
RuntimeError: size mismatch, m1: [600 x 1], m2: [600 x 256] at /pytorch/aten/src/TH/generic/THTensorMath.cpp:197
Process Process-2:
Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.6/dist-packages/machina/samplers/epi_sampler.py", line 126, in mp_sample
    l, epi = one_epi(env, pol, deterministic_flag, prepro)
  File "/usr/local/lib/python3.6/dist-packages/machina/samplers/epi_sampler.py", line 51, in one_epi
    ac_real, ac, a_i = pol(torch.tensor(o, dtype=torch.float))
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/machina/pols/gaussian_pol.py", line 50, in forward
    mean, log_std, hs = self.net(obs, hs, h_masks)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/machina/example/simple_net.py", line 169, in forward
    xs = torch.relu(self.input_layer(xs))
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/linear.py", line 87, in forward
    return F.linear(input, self.weight, self.bias)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py", line 1372, in linear
    output = input.matmul(weight.t())
RuntimeError: size mismatch, m1: [600 x 1], m2: [600 x 256] at /pytorch/aten/src/TH/generic/THTensorMath.cpp:197
Process Process-5:
Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.6/dist-packages/machina/samplers/epi_sampler.py", line 126, in mp_sample
    l, epi = one_epi(env, pol, deterministic_flag, prepro)
  File "/usr/local/lib/python3.6/dist-packages/machina/samplers/epi_sampler.py", line 51, in one_epi
    ac_real, ac, a_i = pol(torch.tensor(o, dtype=torch.float))
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/machina/pols/gaussian_pol.py", line 50, in forward
    mean, log_std, hs = self.net(obs, hs, h_masks)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/machina/example/simple_net.py", line 169, in forward
    xs = torch.relu(self.input_layer(xs))
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/linear.py", line 87, in forward
    return F.linear(input, self.weight, self.bias)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py", line 1372, in linear
    output = input.matmul(weight.t())
RuntimeError: size mismatch, m1: [600 x 1], m2: [600 x 256] at /pytorch/aten/src/TH/generic/THTensorMath.cpp:197
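
For readers following the traceback: the two shapes in the message, m1: [600 x 1] and m2: [600 x 256], can be reproduced without machina or PyTorch. The sketch below is only an illustration using NumPy as a stand-in for the linear layer; the names weight_t and try_matmul are hypothetical and do not come from machina.

```python
import numpy as np

WINDOW, HIDDEN = 600, 256  # sizes taken from the error message

# Stand-in for the transposed weight of the first linear layer: [600 x 256]
weight_t = np.zeros((WINDOW, HIDDEN))

def try_matmul(obs):
    """Return the output shape, or the error text if the shapes do not align."""
    try:
        return (obs @ weight_t).shape
    except ValueError as err:
        return str(err)

column_obs = np.zeros((WINDOW, 1))  # shape reported as m1 in the traceback
flat_obs = np.zeros(WINDOW)         # shape logged for the observation space: Box(600,)

column_result = try_matmul(column_obs)  # inner dimensions 1 and 600 do not match
flat_result = try_matmul(flat_obs)      # 600 matches 600, product is defined

print("column:", column_result)
print("flat:", flat_result)
```

The column-shaped input fails because its last dimension is 1 while the weight expects 600; the flat input of length 600 multiplies cleanly.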

mmisono commented 5 years ago

self.observation_space.shape is (WINDOW,), but the observation actually returned by the environment has shape (WINDOW, 1), because self.data is a 2-D array with a single column. Please give the following a try.

  def getObservation(self):
    return self.data[self.position : self.position + self.WINDOW].flatten()
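
The effect of .flatten() can be checked on its own. This is a minimal sketch, assuming the scaled data is a 2-D array with one column (which is what the [600 x 1] in the error suggests); the linspace data is a made-up stand-in for the CSV, and matches_space is a hypothetical helper, not part of gym or machina.

```python
import numpy as np

WINDOW = 600

# Stand-in for the scaler output: one feature column, so the array is 2-D.
data = np.linspace(-10, 10, 2000).reshape(-1, 1)

position = 0
window_2d = data[position : position + WINDOW]            # original getObservation
window_1d = data[position : position + WINDOW].flatten()  # suggested fix

def matches_space(obs, space_shape):
    """Hypothetical helper: does the observation have the declared shape?"""
    return obs.shape == tuple(space_shape)

print(window_2d.shape, matches_space(window_2d, (WINDOW,)))
print(window_1d.shape, matches_space(window_1d, (WINDOW,)))
```

Slicing rows of a column array keeps the trailing dimension of size 1, so only the flattened window has the shape declared in observation_space.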

hosokawa-taiji commented 5 years ago

It works! Thanks a lot!!!

DeepX-inc / machina

Executed run_ppo.py,RuntimeError: size mismatch occurred. #258
