IBM / pixiedust-facebook-analysis

A Jupyter notebook that uses the Watson Visual Recognition and Natural Language Understanding services to enrich Facebook Analytics and uses Cognos Dashboard Embedded to explore and visualize the results in Watson Studio
https://developer.ibm.com/patterns/discover-hidden-facebook-usage-insights/
Apache License 2.0


Uncover insights from Facebook data with Watson services

WARNING: This repository is no longer maintained.

This repository will not be updated. The repository will be kept available in read-only mode.

In this code pattern, we will use a Jupyter notebook with Watson Studio to glean insights from a vast body of unstructured data. We'll start with data exported from Facebook Analytics. We'll use Watson’s Natural Language Understanding and Visual Recognition to enrich the data.

We'll use the enriched data to answer questions like:

What emotion is most prevalent in the posts with the highest engagement?

Which sentiment has the highest engagement score on average?

What are the top keywords, entities or images measured by total reach?

These types of insights are especially beneficial for marketing analysts who are interested in understanding and improving brand perception, product performance, customer satisfaction, and ways to engage their audiences.
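As a rough illustration of how questions like these map onto the enriched data, here is a minimal sketch in plain Python. The rows and the column names (sentiment, emotion, keyword, engagement, reach) are made up for the example and are not taken from the notebook or from a real export. In the notebook itself, the same grouping would more naturally be done on the pandas DataFrame.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical enriched rows; the column names are assumptions for illustration.
posts = [
    {"sentiment": "positive", "emotion": "joy", "keyword": "cloud", "engagement": 120, "reach": 3000},
    {"sentiment": "negative", "emotion": "sadness", "keyword": "outage", "engagement": 40, "reach": 900},
    {"sentiment": "positive", "emotion": "joy", "keyword": "ai", "engagement": 200, "reach": 5000},
    {"sentiment": "neutral", "emotion": "fear", "keyword": "cloud", "engagement": 60, "reach": 1500},
]


def most_common_emotion_in_top(rows, n):
    """Most frequent emotion among the n posts with the highest engagement."""
    top = sorted(rows, key=lambda row: row["engagement"], reverse=True)[:n]
    return Counter(row["emotion"] for row in top).most_common(1)[0][0]


def mean_engagement_by(rows, column):
    """Average engagement for each distinct value of the given column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row["engagement"])
    return {value: mean(scores) for value, scores in groups.items()}


def total_reach_by(rows, column):
    """(value, total reach) pairs for the given column, largest reach first."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[column]] += row["reach"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


print(most_common_emotion_in_top(posts, 2))
print(mean_engagement_by(posts, "sentiment"))
print(total_reach_by(posts, "keyword"))
```

Each helper corresponds to one of the three questions: the prevalent emotion among top posts, the average engagement per sentiment, and the top keywords by total reach.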

Note that this code pattern is meant to be used as a guided experiment rather than as an application with one set output. The standard Facebook Analytics export features text from posts, articles, and thumbnails, along with standard Facebook performance metrics such as likes, shares, and impressions. This unstructured content is then enriched with Watson APIs to extract keywords, entities, sentiment, and emotion.
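To make the enrichment step more concrete, the sketch below reduces one Natural Language Understanding style result to a handful of flat columns that could sit next to the Facebook metrics. The nested layout of the sample follows the general shape of an NLU analysis result, but the values are hand-written, and the function name and output column names are hypothetical. This is not the notebook's own code.

```python
def flatten_nlu_result(result):
    """Reduce one NLU-style result dict to a flat dict of columns."""
    sentiment = result.get("sentiment", {}).get("document", {})
    emotions = result.get("emotion", {}).get("document", {}).get("emotion", {})

    def top_text(items):
        # Pick the text of the item with the highest relevance, if any.
        if not items:
            return None
        return max(items, key=lambda item: item.get("relevance", 0.0))["text"]

    return {
        "sentiment_label": sentiment.get("label"),
        "sentiment_score": sentiment.get("score"),
        "top_emotion": max(emotions, key=emotions.get) if emotions else None,
        "top_keyword": top_text(result.get("keywords", [])),
        "top_entity": top_text(result.get("entities", [])),
    }


# Hand-written sample in the general shape of an NLU result (not real output).
sample = {
    "sentiment": {"document": {"score": 0.82, "label": "positive"}},
    "emotion": {"document": {"emotion": {
        "joy": 0.7, "sadness": 0.1, "anger": 0.05, "fear": 0.05, "disgust": 0.02,
    }}},
    "keywords": [
        {"text": "data", "relevance": 0.6},
        {"text": "machine learning", "relevance": 0.93},
    ],
    "entities": [{"type": "Company", "text": "IBM", "relevance": 0.95}],
}

row = flatten_nlu_result(sample)
print(row)
```

Missing sections simply yield None, so a post for which a feature could not be extracted still produces a complete row.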

After the data is enriched with Watson APIs, we'll use the Cognos Dashboard Embedded service to add a dashboard to the project. Using the dashboard, you can explore our results and build your own sophisticated visualizations to communicate the insights you've discovered.

This code pattern provides mock Facebook data, a notebook, and several pre-built visualizations to help you start uncovering hidden insights.

When the reader has completed this code pattern, they will understand how to:

Flow


  1. A CSV file exported from Facebook Analytics is added to Object Storage.
  2. Generated code makes the file accessible as a pandas DataFrame.
  3. The data is enriched with Watson Natural Language Understanding.
  4. The data is enriched with Watson Visual Recognition.
  5. A dashboard is used to visualize the enriched data and uncover hidden insights.

Included components

Steps

Follow these steps to set up and run this code pattern. The steps are described in detail below.

  1. Clone the repo
  2. Create a new Watson Studio project
  3. Add services to the project
  4. Create the notebook in Watson Studio
  5. Add credentials
  6. Add the CSV file
  7. Run the notebook
  8. Add a dashboard to the project
  9. Analyze the results

1. Clone the repo

Clone the pixiedust-facebook-analysis repo locally. In a terminal, run the following command:

git clone https://github.com/IBM/pixiedust-facebook-analysis.git

2. Create a new Watson Studio project

3. Add services to the project

4. Create the notebook in Watson Studio

5. Add credentials

Find the notebook cell after the heading "1.5. Add Service Credentials From IBM Cloud for Watson Services".

Set the API key and URL for each service.


Note: This cell is marked as a hidden_cell because it will contain sensitive credentials.
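As a sketch of what filling in that cell involves, the example below keeps an API key and URL per service and reports any value that was left as a placeholder. The variable names, dictionary keys, and placeholder string are assumptions made for this illustration. The actual notebook cell may name things differently, so adapt the idea rather than the names.

```python
# Hypothetical names and placeholder values; the notebook cell may use different ones.
PLACEHOLDER = "<add-your-value>"

service_credentials = {
    "natural_language_understanding": {"apikey": PLACEHOLDER, "url": PLACEHOLDER},
    "visual_recognition": {"apikey": PLACEHOLDER, "url": PLACEHOLDER},
}


def missing_credentials(credentials, placeholder=PLACEHOLDER):
    """Return sorted 'service.field' names that are empty or still a placeholder."""
    missing = []
    for service, fields in credentials.items():
        for field, value in fields.items():
            if not value or value == placeholder:
                missing.append(f"{service}.{field}")
    return sorted(missing)


# With nothing filled in yet, every field is reported.
print(missing_credentials(service_credentials))
```

Running a check like this right after the credentials cell surfaces a forgotten value immediately, instead of as an authentication error several cells later.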

6. Add the CSV file

Add the CSV file to the notebook

Use Find and Add Data (look for the 01/00 icon) and its Files tab. From there you can click browse and add a .csv file from your computer.


Note: If you don't have your own data, you can use our example by cloning this git repo. Look in the data directory.

Insert to code

Find the notebook cell after the heading "2.1 Load data from Object Storage". Place your cursor after the # **Insert to code > Insert pandas DataFrame** line. Make sure this cell is selected before inserting code.

Using the file that you added above (under the 01/00 Files tab), use the Insert to code drop-down menu. Select pandas DataFrame from the drop-down menu.


Note: This cell is marked as a hidden_cell because it contains sensitive credentials.


Fix-up df variable name

The inserted code includes a generated method with credentials and then calls the generated method to set a variable with a name like df_data_1. If you do additional inserts, the method can be re-used and the variable will change (e.g. df_data_2).

Later in the notebook, we set df = df_data_1. If your inserted code uses a different variable name, change one of the two so that they match.

Add file credentials

We want to write the enriched file to the same container that we used above. So now we'll use the same file drop-down to insert credentials. We'll use them later when we write out the enriched CSV file.

After the df setup, there is a cell to enter the file credentials. Place your cursor after the # insert credentials for file - Change to credentials_1 line. Make sure this cell is selected before inserting credentials.

Use the CSV file's drop-down menu again. This time select Insert Credentials.


Note: This cell is marked as a hidden_cell because it contains sensitive credentials.

Fix-up credentials variable name

The inserted code includes a dictionary with credentials assigned to a variable with a name like credentials_1. It may have a different name (e.g. credentials_2). Rename it or reassign it if needed. The notebook code assumes it will be credentials_1.
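Both fix-up steps come down to pointing a fixed name at whichever numbered variable the insert produced. Editing the name by hand is the simplest fix. As an alternative, the minimal sketch below looks the name up by prefix. The helper latest_inserted_name is hypothetical and not part of the notebook, and it assumes the generated names always end in a number.

```python
import re


def latest_inserted_name(namespace, prefix="df_data_"):
    """Return the highest-numbered name of the form <prefix><n> in namespace."""
    pattern = re.compile(rf"^{re.escape(prefix)}(\d+)$")
    numbered = []
    for name in namespace:
        match = pattern.match(name)
        if match:
            numbered.append((int(match.group(1)), name))
    if not numbered:
        raise NameError(f"no variable named {prefix}<n> found; run the inserted cell first")
    # Compare by the numeric suffix so that df_data_10 sorts after df_data_2.
    return max(numbered)[1]


# Stand-in namespace for the example; in a notebook you would pass globals().
example_namespace = {"df_data_1": "first insert", "df_data_2": "second insert", "other": 0}
print(latest_inserted_name(example_namespace))
```

In a notebook cell this could be used as df = globals()[latest_inserted_name(globals())], or with prefix="credentials_" to assign credentials_1. An explicit assignment such as credentials_1 = credentials_2 remains easier to read when you know the name.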

7. Run the notebook

When a notebook is executed, each code cell in the notebook is run in order, from top to bottom.

Each code cell is selectable and is preceded by a tag in the left margin. The tag format is In [x]:. Depending on the state of the notebook, the x can be:

There are several ways to execute the code cells in your notebook:

8. Add a dashboard to the project

Add the enriched data as a project data asset

Associate the project with a Dashboard service

Load the provided dashboard.json file

9. Analyze the results

If you walk through the cells, you will see that we demonstrated how to do the following:

When you try the dashboard, you will see:

Sample output

The provided dashboard uses four tabs to show four simple charts:

The enriched data contains emotions, sentiment, entities, and keywords that were added using Natural Language Understanding to process the posts, links, and thumbnails. Combining the enrichment with the metrics from Facebook gives us many options for what we could show on the dashboard. The dashboard editor also gives you great flexibility in how you arrange your dashboard and visualize your data. The example demonstrates the following:

License

This code pattern is licensed under the Apache License, Version 2. Separate third-party code objects invoked within this code pattern are licensed by their respective providers pursuant to their own separate licenses. Contributions are subject to the Developer Certificate of Origin, Version 1.1 and the Apache License, Version 2.

Apache License FAQ