My own critique (summarized from the report using ChatGPT):
Linear Regression for Message Length Prediction: The linear regression task partially fails due to non-linearities in the data distribution. The model is unable to capture these non-linearities, resulting in incorrect predictions, including negative message lengths and failure to predict very long outlier messages.
Logistic Regression for Message Subject Prediction: The logistic regression task succeeds almost perfectly, but the high input dimensionality (96 dimensions) relative to the small number of samples (around 200) suggests potential overfitting. The model predicts message subjects accurately, with the exception of some confusion between the "main" topic and other topics. Lower accuracy is observed for topics with few samples, indicating a lack of representation in the dataset.
Dimensionality Reduction and Visualization: A 2D PCA projection of the embeddings reveals that the subject "linkedin profile" is distinctly separated from the other subjects, indicating its unique characteristics. However, this topic should arguably be removed from the dataset, as it distorts the distribution of the other samples. (A minimal sketch of all three analyses follows below.)
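For concreteness, a minimal scikit-learn sketch of the three analyses, assuming a 96-dimensional embedding matrix, message lengths and topic labels; the variable names and the random stand-in data are illustrative assumptions, not taken from the report:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((200, 96))            # stand-in for message embeddings
lengths = rng.integers(1, 500, 200)  # stand-in for message lengths
subjects = rng.integers(0, 5, 200)   # stand-in for topic labels

# 1. Linear regression: an unconstrained linear map can output negative
#    lengths and cannot capture non-linear structure or long-tail outliers.
X_tr, X_te, y_tr, y_te = train_test_split(X, lengths, random_state=0)
linreg = LinearRegression().fit(X_tr, y_tr)
print("negative length predictions:", int((linreg.predict(X_te) < 0).sum()))

# 2. Logistic regression: with 96 features and ~200 samples, compare train
#    and test accuracy to check for overfitting.
X_tr, X_te, s_tr, s_te = train_test_split(X, subjects, random_state=0)
logreg = LogisticRegression(max_iter=1000).fit(X_tr, s_tr)
print("train acc:", logreg.score(X_tr, s_tr),
      "test acc:", logreg.score(X_te, s_te))

# 3. 2D PCA projection for visual inspection of cluster separation.
coords = PCA(n_components=2).fit_transform(X)
```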
I inputted the datathinking Wikipedia article preceded by the prompt "Memorize this definition of Datathinking. I will ask you questions about it later.". Then I inputted a text-only version of the report (acquired using pdftotext) preceded by the prompt "Criticize the following report using the definition of Datathinking". The results are below. View the conversation here.
The report on Zulip Chat Analysis can be critiqued using the definition of Data Thinking as follows:
Lack of Emphasis on Data Thinking Framework: The report provides a detailed analysis of the Zulip chat data, including data cleaning, computing embeddings, and applying regression models. However, it does not explicitly mention the application of a data thinking framework or the integration of computational thinking, statistical thinking, and domain thinking.
Limited Discussion on User Experience: The definition of data thinking highlights the importance of user experience in the design of data-driven solutions. The report does not extensively discuss user experience or how it was considered in the analysis process. Considering user needs and incorporating design thinking principles should be an integral part of data thinking.
Limited Exploration of Data Collection Improvement: Data thinking involves questioning the quality of data and considering how data collection can be improved. The report briefly mentions data cleaning but does not thoroughly discuss the process of data collection or potential improvements in data collection methods.
Lack of Iterative Monitoring and Data Collection: Data thinking emphasizes that improving models and solutions is a continuous process that includes iterative monitoring and data collection. The report does not mention iterative monitoring or how ongoing data collection was incorporated into the analysis process.
Inadequate Consideration of Generalization and Bias: Data thinking acknowledges that problem abstraction is domain-dependent, and generalization is subject to biases and variance in the data. The report does not explicitly address these considerations or discuss potential biases or limitations in the analysis.
Limited Discussion on Strategic Context and Risks: Data thinking involves clarifying the strategic context and identifying data-driven risks and opportunities. The report lacks a comprehensive analysis of the strategic context and does not discuss the risks and opportunities associated with the Zulip chat data analysis.
Insufficient Evaluation of Feasibility and Profitability: Data thinking involves measuring business impact and assessing the feasibility and profitability of data-driven solutions. While the report briefly mentions the success of logistic regression and the profitability of the developed solution, it lacks a comprehensive evaluation of feasibility and cost-benefit analysis.
Overall, the report demonstrates some aspects of data analysis but falls short in fully embracing the principles and components of data thinking as defined in the provided definition.
I agree with the model that I didn't place emphasis on datathinking thoughtflows, but instead performed a conventional analysis, including my own critique.
Editors: GPT-4, Copilot
Code repo: github.com/beijn/project-craw
To get started, save an OpenAI API key at `secret/openai_api_key`, run `pip install -r requirements.txt` (optionally inside a virtual environment: `python -m venv venv; source venv/bin/activate`), and run `python main.py`.
There is currently no GUI. The progress will be printed to the console, and the final result is a recursive tree of questions and answers, reasoning and sources.
Language models are powerful, but they are not good at reasoning or at referencing their sources.
The core idea is that complex reasoning tasks involve nested trees of propositions and partial conclusions. Research has shown that decomposing a reasoning task into a 'chain of thoughts' dramatically improves the 'reasoning capabilities' of language models, and that generalizing to nested 'trees of thoughts' improves them even further (arXiv:2305.10601).
Language model 'reasoning' is largely opaque and its factual accuracy is questionable, as models usually do not provide references.
This project aims to build a reasoning agent that uses language models in different modular subtasks to answer questions by recursively answering subquestions. We aim to address the opaque-'reasoning' problem by decomposing a task into disjoint subtasks together with an explanation of the reasoning behind the decomposition. Atomic subquestions are answered by connecting a compound inference engine to Wikipedia. By collecting only the relevant, interpretable source citations, we make the factual basis of every answer transparent and verifiable.
The overall algorithm skeleton is hardcoded in a modular, functionally pure way. The resulting recursive tree of questions with answers, explanations and eventually references arises elegantly as the mirror image of the algorithm's pure recursion. For replacing the static skeleton with a dynamic agent, see Future directions.
All interactions with LLMs go through modular, declarative and descriptive single-purpose functions; see LLM Interfacing. The division into Reasoning and Referencing is therefore historic and aesthetic.
Question Decomposition - Given a question, decides whether it is atomic or compound. An atomic question is answered by the Referencing Agent. A compound question is decomposed into a set of subquestions and a natural-language reasoning statement describing how to combine the subquestions' answers into an answer to the given question.
Answer Recombination - Given the question, the decomposition reasoning and the recursively computed answers, compiles an answer involving only reasoning and no hallucinated knowledge. (The recursive skeleton combining both steps is sketched below.)
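A minimal sketch of the recursive skeleton these two components drive. The `Node` structure and the helper functions are illustrative stand-ins; in the real system the helpers are the LLM-backed single-purpose functions described above, stubbed here so the sketch runs standalone:

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Node:
    """One node of the recursive question tree: answer, reasoning, sources."""
    question: str
    answer: str = ""
    reasoning: str = ""
    sources: List[str] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)

# The four helpers below are LLM-backed in the real system; trivial stubs here.
def is_atomic(question: str) -> bool:
    return len(question) < 40

def decompose(question: str) -> Tuple[List[str], str]:
    half = len(question) // 2
    return [question[:half], question[half:]], "combine the partial answers"

def answer_atomic(question: str) -> Tuple[str, List[str]]:
    return "stub answer", ["https://en.wikipedia.org/wiki/Example"]

def recombine(question: str, reasoning: str, children: List[Node]) -> str:
    return " + ".join(child.answer for child in children)

def solve(question: str) -> Node:
    """Recursively build the tree of questions, answers and sources."""
    if is_atomic(question):
        answer, sources = answer_atomic(question)
        return Node(question, answer=answer, sources=sources)
    subquestions, reasoning = decompose(question)
    children = [solve(q) for q in subquestions]
    return Node(question, answer=recombine(question, reasoning, children),
                reasoning=reasoning, children=children)
```

The returned `Node` tree is exactly the "mirror image of the algorithm's pure recursion" mentioned above: each recursive call produces one subtree of questions, answers, reasoning and sources.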
Uses multiple subsystems to find the answer to an atomic question in a database (Wikipedia), including citations and a source link.
I didn't use langchain's convenient Wikipedia API because it didn't return the full page content. Instead I collect documents manually with the standalone `wikipedia` API.
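A hedged sketch of this manual collection with the standalone `wikipedia` package (`pip install wikipedia`); the calls follow its documented API, but the helper name and the number of results are illustrative assumptions:

```python
import wikipedia

def fetch_candidates(query: str, n: int = 3):
    """Search Wikipedia and return (title, summary, full content) triples."""
    candidates = []
    for title in wikipedia.search(query, results=n):
        try:
            page = wikipedia.page(title, auto_suggest=False)
            candidates.append((page.title, page.summary, page.content))
        except (wikipedia.exceptions.DisambiguationError,
                wikipedia.exceptions.PageError):
            continue  # skip ambiguous or missing pages
    return candidates
```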
Search Generator - Takes an atomic question and generates search queries for Wikipedia to maximize the chance of finding the answer in the first few results.
Abstract Pre-Selector - Computes vector embeddings of the abstracts of the Wikipedia search results and selects those semantically related to the question.
Content Reference Selector - Computes vector embeddings of chunks of the related Wikipedia articles' content and selects the chunks semantically related to the question. (Both selectors share the similarity filter sketched after this list.)
Atomic Answerer - Answers the question with each related content chunk until it finds an answer.
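Both selectors boil down to the same cosine-similarity filter. A minimal sketch, assuming the question and chunk embeddings are already computed; the function name and threshold value are illustrative assumptions:

```python
import numpy as np

def select_related(question_vec: np.ndarray, chunk_vecs: np.ndarray,
                   chunks: list, threshold: float = 0.8) -> list:
    """Keep the chunks whose cosine similarity to the question vector
    exceeds the threshold."""
    sims = chunk_vecs @ question_vec / (
        np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(question_vec))
    return [chunk for chunk, sim in zip(chunks, sims) if sim > threshold]
```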
By decomposing the task in this way we can
[...]
Reading
Pro tip: try using an app on your phone or computer to read aloud to you at 1.5x speed! This can save time and make it easier to absorb information without being visually tied to a screen.
@indrekromet
- [ ] Read https://www.palladiummag.com/2023/02/23/the-west-lives-on-in-the-talibans-afghanistan/

Doing

Set timer: 10 minutes maximum
- [ ] Before asking GPT (to avoid biasing yourself!), write your own critique of your homework. Questions to consider could be: what could be improved? What doesn’t make sense in the visualization? What doesn’t make sense in the writing?

Set timer: 10 minutes maximum
- [ ] Ask GPT-4 to critique the homework or the visual using your favorite data thinking definition we have so far.

Set timer: 10 minutes maximum
- [ ] Add this critique as a comment on the homework’s github issue, and link to the critique in Zulip.

Set timer: 10 minutes maximum
- [ ] Repeat this exercise for the previous homework of one other person in the class

Creating
- [ ] Use `duckdb` to load the Zulip data into a SQL database, and use `altair` to visualize the data, following https://github.com/onefact/datathinking.org-codespace/blob/main/notebooks/in-class-notebooks/230420-debugging-duckdb-altair-falcon-3-1-1-service-requests.ipynb (run this notebook with this data: https://data.cityofnewyork.us/Social-Services/311-Service-Requests-from-2010-to-Present/erm2-nwe9 - and try changing the data source to be the Zulip data and post a visualization of the chat data on Zulip). A sketch of the duckdb-plus-altair step follows below.
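A minimal sketch of that step, assuming the Zulip messages have been exported to a JSON file; the file name and the `sender` column are illustrative assumptions, not taken from the notebook:

```python
import duckdb
import altair as alt

con = duckdb.connect()

# Load the exported chat data into an in-memory SQL table and aggregate it.
df = con.execute("""
    SELECT sender, count(*) AS n_messages
    FROM read_json_auto('zulip_messages.json')
    GROUP BY sender
    ORDER BY n_messages DESC
""").df()

# Render a simple bar chart of messages per sender and save it as HTML.
chart = alt.Chart(df).mark_bar().encode(x='sender:N', y='n_messages:Q')
chart.save('messages_per_sender.html')
```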
Thinking
Listening
Large Language Model Access Checklist