Closed vinodpkd closed 8 years ago
Can I see the first 5 lines of your data file?
Here is the data format:
Date,Open,High,Low,Close,Volume
28-Oct-16,53.65,53.84,53.11,53.53,6620333
27-Oct-16,53.60,53.83,53.13,53.59,7899957
26-Oct-16,53.60,53.84,53.36,53.63,5817798
25-Oct-16,54.10,54.17,53.50,53.67,6052830
24-Oct-16,53.90,54.46,53.89,54.18,6919714
One possible issue is that your dates should be in the opposite order (oldest first). Also, I don't see the variables that you listed in the header of your CSV file.
variables = ["SSO", "SSC"] # If you set "SSO" and "SSC", they should be in your header
Try to format your data so it looks like this:
Date,Open,High,Low,Close,Volume,SSO,SSC
03/01/2013,27.72,27.98,27.52,27.95,34851872,65.7894736842,-0.121
03/04/2013,27.85,28.15,27.7,28.15,38167504,75.9450171821,0.832
03/05/2013,28.29,28.54,28.16,28.35,41437136,84.9230769231,0.151
03/06/2013,28.21,28.23,27.78,28.09,51448912,80.7799442897,-0.689
03/07/2013,28.11,28.28,28.005,28.14,29197632,73.5368956743,-0.821
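The reordering and header check suggested above could be sketched roughly as follows. This is only an illustration using pandas on a few rows copied from the posted file; the in-memory `raw` string stands in for the real CSV, and the `missing` list is a hypothetical helper, not part of clairvoyant.

```python
import io
import pandas as pd

# A few rows copied from the Google Finance export posted above (newest first).
raw = """Date,Open,High,Low,Close,Volume
28-Oct-16,53.65,53.84,53.11,53.53,6620333
27-Oct-16,53.60,53.83,53.13,53.59,7899957
26-Oct-16,53.60,53.84,53.36,53.63,5817798
"""

data = pd.read_csv(io.StringIO(raw))

# Parse dates written like "28-Oct-16", then sort so the oldest row comes first.
data["Date"] = pd.to_datetime(data["Date"], format="%d-%b-%y")
data = data.sort_values("Date").reset_index(drop=True)

# Check that every requested variable is actually a column in the file.
variables = ["SSO", "SSC"]
missing = [name for name in variables if name not in data.columns]
print("Missing columns:", missing)
```

With the posted file, both "SSO" and "SSC" would be reported as missing, which matches the observation that they are not in the header.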
Can I use variables = [] instead of variables = ["SSO", "SSC"]? I don't understand SSO and SSC. I downloaded the data from Google Finance.
data.iloc[0]
Out[20]:
Date      01/02/2013
Open            27.3
High            27.5
Low            27.13
Close           27.5
Volume      13268930
Name: 0, dtype: object
data.iloc[0].round(3)
Traceback (most recent call last):
File "
  File "C:\Anaconda2\lib\site-packages\pandas\core\series.py", line 1234, in round
    result = _values_from_object(self).round(decimals, out=out)
The line data = data.round(3) # Round all values is causing the error. Maybe it cannot round the date?
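If the non-numeric Date column is indeed what breaks the rounding (an assumption; a stray text value in a price column could cause the same error), one possible workaround is to round only the numeric columns. A minimal sketch with made-up values:

```python
import pandas as pd

# Small stand-in for the real file: Date is text, the other columns are numeric.
data = pd.DataFrame({
    "Date": ["01/02/2013", "01/03/2013"],
    "Open": [27.3049, 27.5551],
    "Close": [27.5004, 27.4996],
})

# Select only the numeric columns and round those, leaving Date untouched.
numeric_cols = data.select_dtypes(include="number").columns
data[numeric_cols] = data[numeric_cols].round(3)
print(data)
```

Checking data.dtypes first would show which columns pandas read as text (dtype object) rather than as numbers.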
vinodpkd, SSO and SSC are custom-designed indicators that the author is using to train his model. At the end of his readme he provides a link to another project that is producing these values. Looks like he's mining social media to provide a social sentiment score.
If you don't understand the values, I think you can use any other data to train your model. For example, P/E ratios if you're more familiar with that indicator. Basically, this training data is essential for the model to "learn" how to indicate a buy or sell. It fits a probabilistic classifier over a training set in order to develop an association between the indicators and a buy or sell recommendation. So I think before you can continue any further, you have to provide at least some data as the indicator.
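To make the idea of a probabilistic classifier concrete, here is a rough sketch. It is not clairvoyant's actual code: using scikit-learn's SVC is an assumption (suggested only by the C and gamma parameters in the script posted in this thread), and the training data below is synthetic.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic training set (hypothetical): two indicator values per day, and a
# label saying whether the price went up (1) or down (0) afterwards.
X_up = rng.normal(loc=1.0, scale=0.3, size=(20, 2))
X_down = rng.normal(loc=-1.0, scale=0.3, size=(20, 2))
X = np.vstack([X_up, X_down])
y = np.array([1] * 20 + [0] * 20)

# Fit a classifier that can report class probabilities.
model = SVC(C=1, gamma=10, probability=True, random_state=0)
model.fit(X, y)

# Probabilities for a new day's indicators, ordered as model.classes_ = [0, 1].
proba = model.predict_proba(np.array([[0.9, 1.1]]))[0]

# Only act when the classifier is confident enough (0.65 as in the script).
buy_threshold = 0.65
sell_threshold = 0.65
if proba[1] >= buy_threshold:
    signal = "buy"
elif proba[0] >= sell_threshold:
    signal = "sell"
else:
    signal = "hold"
print(proba, signal)
```

The point is only that the classifier needs at least one indicator column to learn from, which is why an empty variables list would leave it with nothing to train on.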
Exactly, for example I see your dataset contains the header: "Date,Open,High,Low,Close,Volume"
You could then do something like variables = ["High", "Low"] and the program will attempt to learn when to buy and when to sell based on the High and Low values of your data. Note that this is probably not going to work because the raw High/Low values alone are not very predictive of how the stock price will move. Therefore, you should try different indicators, like P/E for example.
"Date,Open,High,Low,Close,Volume,P/E"
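As a sketch of how extra indicator columns could be derived from the columns you already have, assuming pandas: the names RangePct and Return below are hypothetical, chosen only for illustration, and are not claimed to be predictive.

```python
import pandas as pd

# A few rows in the same layout as the example data in this thread.
data = pd.DataFrame({
    "Date": ["03/01/2013", "03/04/2013", "03/05/2013"],
    "High": [27.98, 28.15, 28.54],
    "Low": [27.52, 27.70, 28.16],
    "Close": [27.95, 28.15, 28.35],
})

# Hypothetical indicator 1: the day's trading range as a fraction of the close.
data["RangePct"] = (data["High"] - data["Low"]) / data["Close"]

# Hypothetical indicator 2: fractional change in close from the previous row.
data["Return"] = data["Close"].pct_change()

# The first row has no previous close, so drop it.
data = data.dropna().reset_index(drop=True)

# These column names could then be listed as the variables.
variables = ["RangePct", "Return"]
print(data[["Date"] + variables])
```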
Hope this helps!
from clairvoyant import Backtest
from pandas import read_csv

# Testing performance on a single stock

variables = ["SSO", "SSC"]    # Financial indicators of choice
trainStart = '2013-03-01'     # Start of training period
trainEnd = '2015-07-15'       # End of training period
testStart = '2015-07-16'      # Start of testing period
testEnd = '2016-07-16'        # End of testing period
buyThreshold = 0.65           # Confidence threshold for predicting buy (default = 0.65)
sellThreshold = 0.65          # Confidence threshold for predicting sell (default = 0.65)
C = 1                         # Penalty parameter (default = 1)
gamma = 10                    # Kernel coefficient (default = 10)
continuedTraining = False     # Continue training during testing period? (default = false)

backtest = Backtest(variables, trainStart, trainEnd, testStart, testEnd)

data = read_csv(r"H:/Python/ClairVoyantTest/sbux.csv")    # Read in data
data = data.round(3)                                      # Round all values

backtest.stocks.append("SBUX")    # Inform the model which stock is being tested
for i in range(0,10):             # Run the model 10-15 times
    backtest.runModel(data)

backtest.displayConditions()
backtest.displayStats()
I ran the above code and got the following error:
  File "", line 1, in
    runfile('H:/Python/StockPredictionUsingClairVoyant.py', wdir='H:/Python')
  File "C:\Anaconda2\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 685, in runfile
    execfile(filename, namespace)
  File "C:\Anaconda2\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 71, in execfile
    exec(compile(scripttext, filename, 'exec'), glob, loc)
  File "H:/Python/StockPredictionUsingClairVoyant.py", line 30, in
    backtest.runModel(data)
  File "C:\Anaconda2\lib\site-packages\clairvoyant\Backtest.py", line 72, in runModel
    data['Date'] = to_datetime(data['Date'])
  File "C:\Anaconda2\lib\site-packages\pandas\core\frame.py", line 1914, in getitem
    return self._getitem_column(key)
  File "C:\Anaconda2\lib\site-packages\pandas\core\frame.py", line 1921, in _getitem_column
    return self._get_item_cache(key)
  File "C:\Anaconda2\lib\site-packages\pandas\core\generic.py", line 1090, in _get_item_cache
    values = self._data.get(item)
  File "C:\Anaconda2\lib\site-packages\pandas\core\internals.py", line 3102, in get
    loc = self.items.get_loc(item)
  File "C:\Anaconda2\lib\site-packages\pandas\core\index.py", line 1692, in get_loc
    return self._engine.get_loc(_values_from_object(key))
  File "pandas\index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas\index.c:3979)
  File "pandas\index.pyx", line 157, in pandas.index.IndexEngine.get_loc (pandas\index.c:3843)
  File "pandas\hashtable.pyx", line 668, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12265)
  File "pandas\hashtable.pyx", line 676, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12216)
KeyError: 'Date'
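A KeyError: 'Date' means pandas found no column named exactly Date in the DataFrame. The actual cause cannot be determined from the traceback alone, but one way to investigate is to print the exact column names that were read. The sketch below uses a hypothetical header with stray spaces to show the idea:

```python
import io
import pandas as pd

# Hypothetical file whose header has stray whitespace around the column names.
raw = " Date , Open ,Close\n03/01/2013,27.72,27.95\n"
data = pd.read_csv(io.StringIO(raw))

# Inspect the exact names; hidden spaces or odd characters show up here.
print(list(data.columns))

# Strip surrounding whitespace so that data['Date'] can be found.
data.columns = data.columns.str.strip()
has_date = "Date" in data.columns
print(has_date)
```

Another possibility, also unconfirmed, is an invisible byte-order mark at the start of the file that ends up glued to the first column name; the printed column list would reveal that as well.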
runfile('H:/Python/StockPredictionUsingClairVoyant.py', wdir='H:/Python')
Traceback (most recent call last):
  File "", line 1, in
    runfile('H:/Python/StockPredictionUsingClairVoyant.py', wdir='H:/Python')
  File "C:\Anaconda2\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 685, in runfile
    execfile(filename, namespace)
  File "C:\Anaconda2\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 71, in execfile
    exec(compile(scripttext, filename, 'exec'), glob, loc)
  File "H:/Python/StockPredictionUsingClairVoyant.py", line 27, in
    data = data.round(3) # Round all values
  File "C:\Anaconda2\lib\site-packages\pandas\core\frame.py", line 4335, in round
    new_cols = [np.round(self[col], decimals) for col in self]
  File "C:\Anaconda2\lib\site-packages\numpy\core\fromnumeric.py", line 2782, in round_
    return round(decimals, out)
  File "C:\Anaconda2\lib\site-packages\pandas\core\series.py", line 1234, in round
    result = _values_from_object(self).round(decimals, out=out)
TypeError: can't multiply sequence by non-int of type 'float'
What would be the cause of the error?