Open dmitroboggy opened 6 years ago
Ah yeah, this has been an issue forever. In my experience, it helps to seed the initial values with something vaguely reasonable, but I think this is ultimately down to a deeper problem that I need to look into.
Thanks for bringing this up.
Ok, where in the code can I seed the initial values?
There's no easy way using the command line arguments at the moment, although it shouldn't be too difficult to add in (https://github.com/Torvaney/soccerstan/blob/254018e1eab9e428213a9b352b9dd8cd548e401e/src/soccerstan.py#L166).
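For reference, PyStan 2's `StanModel.sampling` accepts an `init` argument, which can be a list holding one dict of starting values per chain. Below is a minimal sketch of building such a list. The helper `make_inits` and the parameter names (`attack`, `defence`, `home_advantage`) are assumptions for illustration; they would have to match whatever the Stan model actually declares, and a model with constrained parameters may reject some starting points.

```python
import random


def make_inits(n_teams, n_chains=4, jitter=0.1, seed=0):
    """Build one dict of starting values per chain.

    The keys are hypothetical; they must match the parameter names
    declared in the Stan model being fitted.
    """
    rng = random.Random(seed)
    inits = []
    for _ in range(n_chains):
        inits.append({
            # Start every team close to zero, with a little random jitter
            # so that the chains do not all begin at the same point.
            "attack": [rng.uniform(-jitter, jitter) for _ in range(n_teams)],
            "defence": [rng.uniform(-jitter, jitter) for _ in range(n_teams)],
            # A small positive home advantage as a starting guess.
            "home_advantage": 0.3,
        })
    return inits


inits = make_inits(n_teams=18, n_chains=4)
# With a compiled pystan.StanModel called `model` (not shown here):
# fit = model.sampling(data=stan_data, chains=4, init=inits)
```

The number of dicts has to equal the number of chains passed to `sampling`.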
In the meantime, I'd suggest using the Karlis-Ntzoufras model, which gives comparable attack and defence parameters and fits much better.
Unfortunately, Karlis-Ntzoufras takes several hours too, although it behaves differently from Dixon-Coles. I guess reducing the number of iterations could do the trick with Karlis-Ntzoufras.
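In PyStan 2, a shorter run means passing smaller `iter` (and optionally `warmup`) values to `StanModel.sampling`, whose defaults are `iter=2000` and `chains=4`. A small sketch follows; `quick_sampling_kwargs` is a hypothetical helper, and whether soccerstan exposes these settings on the command line is not confirmed here.

```python
def quick_sampling_kwargs(n_iter=500, chains=2, warmup_fraction=0.5):
    """Return keyword arguments for a shorter PyStan 2 sampling run."""
    if n_iter < 2:
        raise ValueError("n_iter must be at least 2")
    # Warmup draws are discarded, so only n_iter - warmup draws are kept.
    warmup = int(n_iter * warmup_fraction)
    return {"iter": n_iter, "warmup": warmup, "chains": chains}


kwargs = quick_sampling_kwargs(n_iter=500, chains=2)
# With a compiled pystan.StanModel called `model` (not shown here):
# fit = model.sampling(data=stan_data, **kwargs)
```

Fewer iterations finish sooner but give noisier estimates, so this is better suited to checking that the model runs at all than to producing final results.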
Wow! Let me check how long it takes for me. What data are you using?
Probably also worth asking what package versions you're using. Could you paste a pip freeze here as well, please?
I used your example.csv as well as the attached Bundesliga file: bundesliga.txt
Here is the pip freeze output:
alabaster==0.7.7 apsw==3.8.11.1.post1 attrs==15.2.0 Babel==1.3 backports.shutil-get-terminal-size==1.0.0 backports.weakref==1.0.post1 BeautifulSoup==3.2.1 bleach==1.5.0 blinker==1.3 certifi==2017.7.27.1 chardet==3.0.4 configglue==1.1.2 configobj==5.0.6 configparser==3.5.0 cryptography==1.2.3 cycler==0.10.0 Cython==0.28.5 debtags==2.0 decorator==4.0.6 defer==1.0.6 dirspec==13.10 docutils==0.12 duplicity==0.7.6 ecdsa==0.13 entrypoints==0.2.3 enum34==1.1.6 feedparser==5.1.3 funcsigs==1.0.2 functools32==3.2.3.post2 future==0.16.0 futures==3.1.1 html5lib==0.9999999 httplib2==0.9.1 idna==2.6 ipaddress==1.0.16 ipykernel==4.6.1 ipython==5.5.0 ipython-genutils==0.2.0 ipywidgets==7.0.1 jdcal==1.0 Jinja2==2.8 joblib==0.12.3 jsonschema==2.6.0 jupyter==1.0.0 jupyter-client==5.1.0 jupyter-console==5.2.0 jupyter-core==4.3.0 Keras==2.0.9 line-profiler==2.0 llvmlite==0.23.2 lockfile==0.12.2 lxml==3.5.0 M2Crypto==0.22.6rc4 Mako==1.0.3 Markdown==2.6.9 MarkupSafe==0.23 matplotlib==2.0.0 mistune==0.7.4 mock==2.0.0 nbconvert==5.3.1 nbformat==4.4.0 nemo-emblems==3.4.1 netifaces==0.10.4 notebook==5.1.0 np-utils==0.5.3.4 numba==0.38.1 numexpr==2.4.3 numpy==1.13.3 oauthlib==1.0.3 oneconf==0.3.9 openpyxl==2.3.0 PAM==0.4.2 pandas==0.18.1 pandas-datareader==0.5.0 pandocfilters==1.4.2 paramiko==1.16.0 pathlib2==2.3.0 patsy==0.5.0 pbr==3.1.1 pexpect==4.0.1 pickleshare==0.7.4 Pillow==3.1.2 piston-mini-client==0.7.5 prompt-toolkit==1.0.15 protobuf==3.4.0 ptyprocess==0.5 pyasn1==0.1.9 pyasn1-modules==0.0.7 pycrypto==2.6.1 pycups==1.9.73 pycurl==7.43.0 Pygments==2.1 pygobject==3.20.0 pyinotify==0.9.6 PyJWT==1.3.0 pyneurgen==0.3.1 PyOpenGL==3.0.2 pyOpenSSL==0.15.1 pyparsing==2.0.3 pyparted==3.10.7 Pyrex==0.9.8.5 pyserial==3.0.1 pysmbc==1.0.15.5 pystan==2.18.0.0 python-apt==1.1.0b1+ubuntu0.16.4.2 python-dateutil==2.4.2 python-debian==0.1.27 python-xlib==0.14 pytz==2014.10 pyxdg==0.25 PyYAML==3.12 pyzmq==15.2.0 qtconsole==4.3.1 reportlab==3.3.0 requests==2.18.4 requests-file==1.4.2 requests-ftp==0.3.1 
roman==2.0.0 rope==0.10.2 scandir==1.5 scikit-learn==0.19.1 scipy==0.17.0 service-identity==16.0.0 setproctitle==1.1.8 simplegeneric==0.8.1 singledispatch==3.4.0.3 six==1.10.0 sklearn==0.0 Sphinx==1.3.6 sphinx-rtd-theme==0.1.9 spyder==2.3.8 subprocess32==3.5.2 tables==3.2.2 tensorflow==1.4.0 tensorflow-tensorboard==0.4.0rc2 terminado==0.6 testpath==0.3.1 tornado==4.2.1 traitlets==4.3.2 Twisted==16.0.0 urllib3==1.22 uTidylib==0.2 wcwidth==0.1.7 webencodings==0.5.1 Werkzeug==0.12.2 widgetsnbextension==3.0.3 xlrd==0.9.4 zope.interface==4.1.3
Hmmm, it runs fine on pystan 2.14 for me and pretty quickly (< 5 mins excluding compilation on a 4GB laptop), but I suspect it might work better with some changes in pystan 2.18 (what you're using based on the pip freeze output).
I'll have a look at upgrading everything to the latest version of pystan, but that probably won't be until the weekend. In the meantime, you could try using 2.14 (pip install pystan==2.14.0) and seeing if that helps.
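After switching versions, it can help to confirm which release series is actually installed. The sketch below compares two version strings by their major.minor prefix; `parse_version` and `same_minor_series` are hypothetical helpers written for this thread, not part of pystan or soccerstan.

```python
def parse_version(text):
    """Turn '2.18.0.0' into (2, 18, 0, 0), keeping only the digits
    found in each dot-separated piece."""
    parts = []
    for piece in text.split("."):
        digits = "".join(ch for ch in piece if ch.isdigit())
        if digits:
            parts.append(int(digits))
    return tuple(parts)


def same_minor_series(installed, wanted):
    """True if both versions share the same major.minor prefix."""
    return parse_version(installed)[:2] == parse_version(wanted)[:2]


# Version strings taken from this thread.
print(same_minor_series("2.18.0.0", "2.14.0"))  # False
print(same_minor_series("2.14.0", "2.14.0"))    # True

# To read the installed version, if pystan is importable:
# import pystan; print(pystan.__version__)
```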
"Karlis-Ntzoufras lasts for several hours too"
This seems really surprising to me, especially for one season of data. Do you mind me asking what the rest of your setup looks like? Operating system, RAM?
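One quick way to collect these details is from Python itself. This is a rough sketch written for Python 3 (`os.cpu_count` is not available on 2.7); `describe_setup` is a hypothetical helper, and the RAM lookup relies on `os.sysconf`, which only works on Unix-like systems, so it falls back to `None` elsewhere.

```python
import os
import platform


def describe_setup():
    """Collect basic OS, Python, CPU and RAM details for a bug report."""
    info = {
        "os": platform.platform(),
        "python": platform.python_version(),
        "cpu_count": os.cpu_count(),
    }
    try:
        # Physical memory = page size * number of physical pages.
        page_size = os.sysconf("SC_PAGE_SIZE")
        pages = os.sysconf("SC_PHYS_PAGES")
        info["ram_gib"] = round(page_size * pages / 1024.0 ** 3, 1)
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing or does not know these names here.
        info["ram_gib"] = None
    return info


for key, value in sorted(describe_setup().items()):
    print("{}: {}".format(key, value))
```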
system To be filled by O.E.M. (To be filled by O.E.M.)
/0 bus 970A-UD3P
/0/0 memory 64KiB BIOS
/0/4 processor AMD FX(tm)-6300 Six-Core Processor
/0/4/5 memory 288KiB L1 cache
/0/4/6 memory 6MiB L2 cache
/0/4/7 memory 8MiB L3 cache
/0/2c memory 8GiB System Memory
/0/2c/0 memory 4GiB DIMM DDR3 Synchronous 667 MHz (1,5 ns)
/0/2c/1 memory DIMM Synchronous [empty]
/0/2c/2 memory 4GiB DIMM DDR3 Synchronous 667 MHz (1,5 ns)
/0/2c/3 memory DIMM Synchronous [empty]
/0/100 bridge RD890 PCI to PCI bridge (external gfx0 port B)
/0/100/0.2 generic RD990 I/O Memory Management Unit (IOMMU)
/0/100/2 bridge RD890 PCI to PCI bridge (PCI express gpp port B)
/0/100/2/0 display Oland XT [Radeon HD 8670 / R7 250/350]
/0/100/2/0.1 multimedia Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series]
/0/100/9 bridge RD890 PCI to PCI bridge (PCI express gpp port H)
/0/100/9/0 enp2s0 network RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
/0/100/11 storage SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode]
/0/100/12 bus SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
/0/100/12/1 usb4 bus OHCI PCI host controller
/0/100/12/1/1 input USB Receiver
/0/100/12/1/2 input 2.4G Keyboard Mouse
/0/100/12.2 bus SB7x0/SB8x0/SB9x0 USB EHCI Controller
/0/100/12.2/1 usb1 bus EHCI Host Controller
/0/100/13 bus SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
/0/100/13/1 usb5 bus OHCI PCI host controller
/0/100/13.2 bus SB7x0/SB8x0/SB9x0 USB EHCI Controller
/0/100/13.2/1 usb2 bus EHCI Host Controller
/0/100/14 bus SBx00 SMBus Controller
/0/100/14.2 multimedia SBx00 Azalia (Intel HDA)
/0/100/14.3 bridge SB7x0/SB8x0/SB9x0 LPC host controller
/0/100/14.4 bridge SBx00 PCI to PCI Bridge
/0/100/14.5 bus SB7x0/SB8x0/SB9x0 USB OHCI2 Controller
/0/100/14.5/1 usb6 bus OHCI PCI host controller
/0/100/16 bus SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
/0/100/16/1 usb7 bus OHCI PCI host controller
/0/100/16/1/4 input USB Receiver
/0/100/16.2 bus SB7x0/SB8x0/SB9x0 USB EHCI Controller
/0/100/16.2/1 usb3 bus EHCI Host Controller
/0/101 bridge Family 15h Processor Function 0
/0/102 bridge Family 15h Processor Function 1
/0/103 bridge Family 15h Processor Function 2
/0/104 bridge Family 15h Processor Function 3
/0/105 bridge Family 15h Processor Function 4
/0/106 bridge Family 15h Processor Function 5
/0/1 scsi0 storage
/0/1/0.0.0 /dev/cdrom disk CDDVDW SH-224DB
/0/2 scsi2 storage
/0/2/0.0.0 /dev/sda disk 1TB WDC WD10EZRX-00A
/0/2/0.0.0/1 /dev/sda1 volume 931GiB Windows NTFS volume
/0/3 scsi3 storage
/0/3/0.0.0 /dev/sdb disk 126GB SanDisk SDSSDP12
/0/3/0.0.0/1 /dev/sdb1 volume 62GiB Extended partition
/0/3/0.0.0/1/5 /dev/sdb5 volume 41GiB Linux filesystem partition
/0/3/0.0.0/1/6 /dev/sdb6 volume 335MiB Linux filesystem partition
/0/3/0.0.0/1/7 /dev/sdb7 volume 12GiB Linux filesystem partition
/0/3/0.0.0/2 /dev/sdb2 volume 350MiB Windows NTFS volume
/0/3/0.0.0/3 /dev/sdb3 volume 51GiB Windows NTFS volume
/0/3/0.0.0/4 /dev/sdb4 volume 2784MiB Linux swap volume
Anyway, in the meantime, can you explain how to interpret the output?
Hi! After executing
python soccerstan.py '../data/example.csv' 'dixon-coles'
I get the following output (full output here: log.txt). After that, nothing more happens for hours. Do the initial values cause the problem, and what can be done about it? Thanks in advance!
Python 2.7.12, numpy 1.13.3, pandas 0.18.1, pystan 2.18.0.0, matplotlib 1.5.1