Read this in other languages: 한국어.
In this Code Pattern we will use a Jupyter notebook to showcase an example of machine learning with time series on IBM Power8 systems. The notebook will focus on evaluating the predictability of future financial market values in the "renewable energy" sector by examining related markets and sentiment detected in New York Times news articles.
When the reader has completed this Code Pattern, they will understand how to:
The intended audience for this Code Pattern is application developers who need to efficiently build powerful deep learning applications, but who may not have an abundance of time or data science experience.
Follow these steps to set up and run this Code Pattern. The steps are described in detail below.
IBM has partnered with Nimbix to provide cognitive developers with a trial account that includes 24 hours of free processing time on the PowerAI platform. Follow these steps to register for access to Nimbix to try the PowerAI Cognitive Code Patterns and explore the platform.
Go to the IBM Marketplace PowerAI Portal, and click Request Trial.
On the IBM PowerAI Trial page, shown below, enter the required information to sign up for an IBM account and click Continue. If you already have an IBM ID, click Already have an account? Log in, enter your credentials and click Continue.
On the Almost there… page, shown below, enter the required information and click Continue to complete the registration and launch the IBM Marketplace Products and Services page.
Your IBM Marketplace Products and Services page displays all offerings that are available to you; the PowerAI Trial should now be one of them. From the PowerAI Trial section, click Launch, as shown below, to launch the IBM PowerAI trial page.
The Welcome to IBM PowerAI Trial page provides instructions for accessing the trial, as shown below. Alternatively, you will receive an email confirming your registration with similar instructions that you can follow to start the trial.
Summary of steps for starting the trial:
Start a terminal session from your local machine and issue the following command, where {IP Address} is the IP address (or host name) shown on the welcome page (or in the confirmation email).
ssh -L 8888:localhost:8888 nimbix@{IP Address}
Enter the password shown on the welcome page (or in the confirmation email) when prompted.
From your local browser, go to the following URL to get started: http://localhost:8888/tree/.
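If the browser cannot load the page, it can help to confirm that the ssh tunnel is actually forwarding port 8888. The following is a minimal sketch, not part of the original Code Pattern; the helper name port_is_open is hypothetical, and it only checks that something accepts TCP connections on the local port, not that the Jupyter server behind it is healthy.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 8888 is the local port forwarded by the ssh -L command above.
    if port_is_open("localhost", 8888):
        print("Tunnel is up: open http://localhost:8888/tree/")
    else:
        print("Nothing is listening on localhost:8888; check the ssh session.")
```

Run the script on your local machine, in a second terminal, while the ssh session is still open.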
Use git clone to download the example notebook and data with a single command.
Open a terminal window by clicking the New pull-down and selecting Terminal.
Run the following command in the terminal:
git clone https://github.com/IBM/powerai-notebook
In the Files tab, click on powerai-notebook, then notebooks, and then Clean_Energy_Watson_V1.0.ipynb to open the notebook.
When a notebook is executed, each code cell in the notebook is executed, in order, from top to bottom.
Each code cell is selectable and is preceded by a tag in the left margin. The tag format is In [x]:. Depending on the state of the notebook, the x can be:
*, which indicates that the cell is currently executing.
There are several ways to execute the code cells in your notebook:
Select a cell and press the Play button in the toolbar.
From the Cell menu, there are several options available. For example, you can Run All cells in your notebook, or you can Run All Below, which starts executing from the first cell under the currently selected cell and then continues executing all cells that follow.
Notes:
Regarding cell [4]: For the Code Pattern we import stock market data that has already been collected. The data can also be collected inside the notebook, but that requires access to private financial websites (such as Bloomberg), which charge a subscription fee.
Regarding cell [5]: In an effort to speed up the notebook processing time, the New York Times data has already been collected and stored in a JSON file, and is imported by the notebook.
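Importing pre-collected article data from JSON can be sketched as follows. This is an illustration only, not the notebook's own code: the file name articles.json, the record fields (date, headline, sentiment) and the helper load_articles are all assumptions made for the example, and the real file may be laid out differently.

```python
import io
import json


def load_articles(fp):
    """Read a JSON array of article records from an open text file."""
    records = json.load(fp)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of article records")
    # Skip records that lack the (assumed) fields the later steps rely on.
    return [
        r for r in records
        if isinstance(r, dict) and "date" in r and "sentiment" in r
    ]


# Stand-in for open("articles.json"); the file name is hypothetical.
sample = io.StringIO(
    '[{"date": "2017-01-03", "headline": "Solar output rises", "sentiment": 0.5},'
    ' {"date": "2017-01-04", "headline": "No score yet"}]'
)
articles = load_articles(sample)
print(len(articles))  # 1
```

Reading from a local file avoids repeating the article collection and sentiment scoring on every run, which is the speed-up the note above describes.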
The Code Pattern is based on the original Google Cloud Platform example documented at https://cloud.google.com/solutions/machine-learning-with-financial-time-series-data. The difference between this "IBM Demo" and the original "Google Demo" is noted in the following table:
The result of running the notebook is a report which may be shared with or without sharing the code. You can share the code for an audience that wants to see how you came to your conclusions. The text, code and output/charts are combined in a single web page. For an audience that does not want to see the code, you can share a web page that only shows text and output/charts.
The graphs and charts produced in this Code Pattern attempt to prove that the closing value of the Nasdaq Clean Energy Index can be predicted by examining various input sources, such as the New York Times and other financial markets, both foreign and domestic. These markets include:
The notebook begins by collecting and formatting data:
Collect and merge 3 years of stock market financial data.
Collect 3 years of "green energy" articles from the New York Times. This data is then fed into the Watson Natural Language Understanding service for sentiment analysis, which assigns a relative positive or negative score to each article.
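The two collection steps above end with market data and article sentiment that have to line up by date. A minimal sketch of that alignment in plain Python follows; the function names, the series names (index_a, index_b) and the sample values are hypothetical, and averaging the article scores per day is one reasonable choice rather than something the source specifies.

```python
from collections import defaultdict


def daily_mean(scores):
    """Average (date, score) pairs into one score per date."""
    by_day = defaultdict(list)
    for day, score in scores:
        by_day[day].append(score)
    return {day: sum(v) / len(v) for day, v in by_day.items()}


def merge_on_date(series):
    """Inner-join {name: {date: value}} into rows sorted by date.

    Only dates present in every series are kept, so each row is complete.
    ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    """
    common = set.intersection(*(set(s) for s in series.values()))
    return [
        {"date": day, **{name: s[day] for name, s in series.items()}}
        for day in sorted(common)
    ]


# Made-up closing values and sentiment scores, for illustration only.
closes = {
    "index_a": {"2017-01-03": 100.0, "2017-01-04": 101.5},
    "index_b": {"2017-01-03": 50.0, "2017-01-04": 49.0, "2017-01-05": 49.5},
}
sentiment = daily_mean(
    [("2017-01-03", 0.5), ("2017-01-03", -0.25), ("2017-01-04", 0.75)]
)
rows = merge_on_date({**closes, "sentiment": sentiment})
print(rows[0])
```

An inner join drops days on which any one market was closed. Filling such gaps with the previous close is a common alternative when markets in different countries keep different holidays.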
The notebook then utilizes EDA (exploratory data analysis) methods to find correlations in the data. These findings include:
The final conclusions from the EDA are as follows:
After determining this correlation in the data, the notebook then uses TensorFlow and the IBM PowerAI machine learning framework to train and test models on the data.
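Before any model is trained, the kind of correlation check that EDA relies on can be sketched in a few lines. This is not the notebook's code: it uses plain Python rather than the notebook's libraries, and the helpers pct_changes, pearson and lagged_correlation are hypothetical names. The idea is to compare one market's daily change with another market's change on a later day, since only earlier information can be used to predict a close.

```python
import math


def pct_changes(values):
    """Day-over-day fractional change of a price series."""
    return [(b - a) / a for a, b in zip(values, values[1:])]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


def lagged_correlation(predictor, target, lag=1):
    """Correlate predictor[t - lag] with target[t]."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(predictor, target)
    return pearson(predictor[:-lag], target[lag:])
```

A coefficient near +1 or -1 at a positive lag suggests that the predictor series carries information about the target's next move; a value near 0 suggests it does not, at least not linearly.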
After hundreds of thousands of iterations over the data using multiple models, the notebook achieves a 70% success rate for predicting whether the Nasdaq Clean Energy Index would close up or down on any given day.
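A success rate for up/down prediction is simply the fraction of days on which the predicted direction matches the actual one. The sketch below shows one way such a figure might be computed, assuming the labels are derived from consecutive closing values and that the test days come strictly after the training days. The helper names and the 80/20 split are assumptions for illustration, not details taken from the notebook.

```python
def direction_labels(closes):
    """1 if the close is higher than the previous day's close, else 0."""
    return [1 if b > a else 0 for a, b in zip(closes, closes[1:])]


def chronological_split(rows, train_fraction=0.8):
    """Split time-ordered rows without shuffling, so the test set is later."""
    if not 0 < train_fraction < 1:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def accuracy(predicted, actual):
    """Fraction of positions where the predicted label equals the actual one."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty label lists of equal length")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)


# Made-up closing values, for illustration only.
labels = direction_labels([10.0, 11.0, 10.5, 10.5, 12.0])
print(labels)  # [1, 0, 0, 1]
print(accuracy([1, 0, 1, 1], labels))  # 0.75
```

Splitting by time rather than at random matters for time series: a shuffled split lets the model train on days that come after the days it is tested on, which tends to overstate the success rate.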
Because this notebook is running temporarily on a Nimbix Cloud server, the options for saving and sharing the notebook are limited.
Under the File menu, there are options to:
Download as... will download the notebook to your local system.
Print Preview will allow you to print the current state of the notebook.
When you are done with your work, please cancel your subscription by issuing the following command in your ssh session or by visiting the Manage link on the My Products and Services page.
sudo poweroff --force
This code pattern is licensed under the Apache Software License, Version 2. Separate third party code objects invoked within this code pattern are licensed by their respective providers pursuant to their own separate licenses. Contributions are subject to the Developer Certificate of Origin, Version 1.1 (DCO) and the Apache Software License, Version 2.