onefact / datathinking.org

Data Thinking website deployed using GitHub Pages
https://datathinking.org
Apache License 2.0

[homework: asking, writing, thinking, doing, 🔴 red-teaming 😈] Critique what you have built; Real-World Data; Historiography of Data, Incentives, & AI #157

Closed 0rd0s1n1ster closed 1 year ago

0rd0s1n1ster commented 1 year ago

Reading

Pro tip: try using an app on your phone or computer to read aloud to you at 1.5x speed! This can save time and make it easier to absorb information without having to keep your eyes on a screen.

Doing

Creating

Thinking

Listening

Large Language Model Access Checklist

0rd0s1n1ster commented 1 year ago

The research question I want to focus on is ChatGPT detection. The availability of such convenient tools makes people lazy, and abuse goes hand in hand with it (even though, at the time of this comment, ChatGPT is unavailable due to overload). My idea is to take theses from the Science and Technology curriculum as the human-written source, since I believe that I and other graduates of my curriculum share something in common in our psychosphere. Then I will cut the text into chunks of 150 words and send them to ChatGPT with a request to rephrase them. Based on the responses collected, I plan to train a DistilBERT model to classify text as written by a human (a Sci&Tech student) or by ChatGPT. The overall budget is expected to be under $5. The model choice is constrained by the size I can load into my GPU and by the amount of time I am willing to wait until it is ready, so a lighter model is used.
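The data preparation step described above could be sketched roughly as follows. This is only a minimal illustration, not the actual pipeline: `chunk_words`, `build_labeled_rows`, and the `rephrase` callable are hypothetical names, and `rephrase` stands in for whatever request is sent to ChatGPT. Labels 0 (human) and 1 (ChatGPT) are also an assumption.

```python
from typing import Callable, Dict, List, Union


def chunk_words(text: str, chunk_size: int = 150) -> List[str]:
    """Split text on whitespace into consecutive chunks of at most chunk_size words."""
    words = text.split()
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]


def build_labeled_rows(
    chunks: List[str],
    rephrase: Callable[[str], str],
) -> List[Dict[str, Union[str, int]]]:
    """Pair each human-written chunk with its rephrased version.

    `rephrase` is a hypothetical placeholder for the ChatGPT request;
    label 0 marks the human text and label 1 the rephrased text (assumed convention).
    """
    rows: List[Dict[str, Union[str, int]]] = []
    for chunk in chunks:
        rows.append({"text": chunk, "label": 0})
        rows.append({"text": rephrase(chunk), "label": 1})
    return rows
```

The resulting rows could then be fed to a text classifier such as DistilBERT. Note that the last chunk of a thesis may be shorter than 150 words; whether to keep or drop it is a design choice left open here.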

0rd0s1n1ster commented 1 year ago

Here I have also loaded the data with DuckDB and visualized it using Altair. Figure 1. The number of messages that have attachments and, of those, how many have images. It is interesting to see that some messages without attachments have an image attached, which I find weird: image

Figure 2. Number of messages per sender: image