uchicago-computation-workshop / Fall2020

Repository for the Fall 2020 Computational Social Science Workshop

10/08: Jon Clindaniel #2

Open shevajia opened 4 years ago

shevajia commented 4 years ago

Comment below with questions or thoughts about the reading for this week's workshop.

Please make your comments by Wednesday 11:59 PM, and upvote at least five of your peers' comments on Thursday prior to the workshop. You need to use 'thumbs-up' for your reactions to count towards 'top comments,' but you can use other emojis on top of the thumbs up.

nwrim commented 4 years ago

Hi Jon! I am looking forward to seeing you talk about these khipus!

I have two questions, one a bit meta and one a bit more related to khipus: 1) I know I am a bit biased on this matter, but I feel like anthropology is not a domain that loves quantitative analysis (although the dichotomy of qualitative and quantitative might not be as meaningful these days as it once was; and yes, I might be excluding those evolutionary anthropology people when I say this). Chapter 6 felt like the work of a person who is an expert in explaining quantitative analysis to non-quantitative people (e.g. spending two paragraphs on taking the log - I felt like "look at fig 6.2" would have been sufficient when the crowd is more quantitative people). Is there some wisdom that you would like to share with us, who I think will encounter similar situations from time to time?

2) In your analysis in chapter 6, you showed that Colesuyu khipus are distinct from other khipus. Your explanation for these outliers, that they come from regions where the Inkas had less control and thus might have had a different coding system based on their different economic market, raised a question for me: what could be the advantage of having another local, idiosyncratic system? I don't really see how a knot-tying system could be more optimized for local needs - maybe this is because I have heard that many dialects share the same writing system despite using quite different phonetics (and also that, if there were indeed a more efficient system, the big Inka empire would have adopted it). Do you have any insights on this issue (or maybe this is not an issue at all)?

Also, for anybody who wants to know how to actually tie these khipus - Jon has a youtube channel that shows how to.

rkcatipon commented 4 years ago

Hi Jon, really looking forward to your presentation! One of my initial thoughts when reading your research was whether you found archaeology and anthropology to be receptive to the new quantitative methods you explored. Did you experience any pushback or skepticism? If so, how did you win people over? I think this question may be related to Nak Won's first question actually. Conversely, do you have any concerns with the explosion of quantitative methods in the social sciences? In chapter 6, you did a great job teasing out the nuances amongst the Khipus system and I thought this was a good example of using a cultural understanding to drive data analysis. In other circumstances, some might have been tempted to simply dismiss the outlier.

jinfei1125 commented 4 years ago

Hi! First, may I ask where I can find the reading materials? Where can I find "chapter 6" as others mentioned? I didn't find it in the README file... Second, I came up with a general question: because Jon is our Computer Science with Social Sciences Application lecturer, I wonder whether you learned Python first and then found it useful for your archaeology research, or whether you first had the idea to decipher the non-numerical Inka khipu signs and learned Python to solve this problem? Thank you~

wanitchayap commented 4 years ago

I want to follow up on @nwrim's 2nd question. I agree with him that it is probably more efficient to maintain one writing system instead of many. Do you think that this differentiation mainly came from optimization (like Nak Won mentioned) or that it actually originated from a pidgin/creole type of process (like the English-based creole, Hawaiian pidgin)? I think this could be an important question to ask since Khipus are so closely related to the economic system, but at the same time we could see that they are so language-like. It would be interesting to see whether the Khipus' writing system--beyond having consistent structure, grammar, etc.--also evolved and behaved like other systems with spoken languages. How do you think we can approach this question quantitatively (if this question is valid at all)?

shevajia commented 4 years ago

> Hi! First, may I ask where I can find the reading materials? Where can I find "chapter 6" as others mentioned? I didn't find it in the README file... Second, I came up with a general question: because Jon is our Computer Science with Social Sciences Application lecturer, I wonder whether you learned Python first and then found it useful for your archaeology research, or whether you first had the idea to decipher the non-numerical Inka khipu signs and learned Python to solve this problem? Thank you~

Please check your UChicago mailbox. I sent Jon's book chapters to all MACSS students last Friday.

SoyBison commented 4 years ago

Hi Jon, I love your work on khipus and I myself have been fascinated by them since I took an Ancient Civ of the New World class in college. My question is sort of in the abstract about how we do data science, and how we analyze something that must have a classification system at its core, but whose features inherently resist feature decomposition. (By which I mean we can't easily define characters, phonemes, pixels, radicals, strokes, symbols like we can with printed languages.)

Especially early on in the book you sent out, you make a number of comparisons to the decipherment of Egyptian Hieroglyphs. This got me thinking: with the benefit of hindsight, we can look at that turning point and see that, yes, of course hieroglyphs must be phonetic, because practically all of the writing systems in the region are phonetic. To claim that hieroglyphs are not phonetic, an abjad, etc., you would be claiming that a dominant writing system with an "alien" syntax existed in the area for thousands of years without influencing scripts in nearby areas. Now transpose this to South America and Khipus. Khipus, as a writing system, are much more "alien" to western languages than any of the major writing systems of the old world (at least logographs and ideograms can be printed on a computer screen, what in the heck is the unicode block for Khipus?), and many of the systems of the new world. With this long-winded setup, my question is, how do we deal with this fact? Khipus have a certain tactility that no other writing system has; this became apparent to me in your cultural patterns class last year when, as an exercise, you taught us the basis of the Khipu accounting system. When we code this into a database, how can we be certain that we aren't missing some subtleties of the system? Chirality, knot size, material, dyes, etc., are believed to have some sort of meaning on a Khipu. How do we make sure we aren't missing something due to oversight? On the other side of the coin, how do we make sure we aren't attributing signal to the noise if we overcorrect for this bias?

In short, What? How?

Thanks, Coen D. Needell

JadeBenson commented 4 years ago

I have a philosophical question: What are your thoughts about the Sapir-Whorf hypothesis?

In Chapter 3, you state: "I expect to see instances of khipu sign expression shaping a heteromaterial continuum," (37). How far do you think this type of shaping extends? How much of an influence do you think language has on our experience and understanding of the world - and how do you think this applies to the Inka?

mikepackard415 commented 4 years ago

Hi Jon, thanks for sharing your work with us!

My question has to do with the various definitions of writing systems. I appreciate the argument in chapter 1 about how writing systems need not be phonetic in nature, but only require conventionalized signs that achieve some discourse around "unique political and social uses." At the same time, I think the reason some scholars may emphasize a phonetic basis is the vast number of signs (words) available. The complex combinations and structures these enable in human languages create the "open-endedness" we find in what we typically think of as a writing system. Khipus, even if they do show a low type/token ratio, have a relatively small number of possible signs. Do you see this as a limiting factor for the ultimate usefulness of a writing system based on khipus?

Thank you! Mike
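
For readers unfamiliar with the term, the type/token ratio mentioned above is the number of distinct signs (types) divided by the total number of sign occurrences (tokens). A minimal sketch, assuming a hypothetical list of cord colors as the "signs" (the data and the function name `type_token_ratio` are illustrative and are not taken from the book):

```python
def type_token_ratio(tokens):
    """Return distinct signs (types) divided by total sign occurrences (tokens)."""
    if not tokens:
        raise ValueError("need at least one token")
    return len(set(tokens)) / len(tokens)


# Hypothetical sequence of cord colors read off one khipu (made-up data).
cord_colors = ["white", "brown", "white", "white", "blue", "brown", "white", "white"]

ttr = type_token_ratio(cord_colors)
print(f"{len(set(cord_colors))} types / {len(cord_colors)} tokens = {ttr}")
```

A low ratio means a few signs are reused many times, which is the pattern the comment refers to; it says nothing by itself about how many signs the system could support.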

ginxzheng commented 4 years ago

I was curious about the outliers as well! In your fascinating work you added spatial data to illustrate the variations. I was wondering, if other facets of data were available, would it be possible to distinguish any further exceptions and explain them with qualitative background? Would you first suspect variation along some other variables, such as ethnic groups, time series, etc., based on qualitative investigations, and then develop data to test them? Or are the steps in reverse? Thank you!

anqi-hu commented 4 years ago

Hi Jon, thanks for sharing your work with us! It is fascinating to see how much is being explored towards a more comprehensive decipherment of the Khipus system. At the same time, as you have stated, the work is mostly conjectural as there is hardly any way of substantiating the interpretations. My question is, if more of the records are unearthed and added to the data that you are working with, how much do you see these new data points as capable of shifting the original hypotheses/ theoretical propositions regarding the system? Thank you!

luckycindyyx commented 4 years ago

Hi Jon, thank you so much for introducing such fascinating work to us. I am particularly interested in chapter 7 "Outlining A Grammar", and my questions are: What are the computational methods used most frequently when exploring a grammar pattern of a new language? And if a grammar pattern has been found, will you exploit machine learning to "decipher" other materials?

TwoCentimetre commented 4 years ago

I am curious about what the first step to decipher an unknown sign system is. And if we finally come up with some interpretations of this unknown system, would these interpretations be falsifiable, especially for those systems used by ancient people? Since this system is totally unfamiliar to us, it can be meaningful in totally different dimensions. So what should the starting point be? Does that totally depend on the researchers' experience and luck? And if the people who used this system no longer exist, how can we know if we got it right?

yierrr commented 4 years ago

Hi! Just curious, how were other ancient language systems (like ancient Egyptian) deciphered? How are Inka khipus different from these ancient languages that have been deciphered, so that they need big-picture computational analysis to help with the deciphering process? Thanks!

wu-yt commented 4 years ago

Thank you so much for this interesting research! As computational analysis using algorithms and big data often have some unanticipated side effects, I’m curious whether you encountered any of these issues and how did you solve them?

boyafu commented 4 years ago

Hi Jon, thanks for sharing! I am fascinated by the idea of utilizing computational tools for interpreting Inka khipu signs. The logic of programming language is helpful to solve the rules of using and combining signs. I was wondering if the Inka Writing System would have some varieties within the language, which might be quite similar yet slightly different from the rigorous rules of the writing system? If that is the case, could computational methods identify the similarities and varieties within the language system? Thanks!

minminfly68 commented 4 years ago

Thanks for this interesting presentation. I highly agree with Nak Won's point: how to present this kind of work to a traditional audience is one of the biggest challenges in some traditional social science disciplines. If you could share some ideas with us, we would really appreciate it. Also, is there any trend of computational methods digging further into social science theory, etc.? Looking forward to the presentation :)

Raychanan commented 4 years ago

We have used technology to enable the recognition of text or speech. But how can we verify that these guesses of ours are correct? Isn't it possible that these speculations about past words or speech, obtained on the basis of technical means, are completely wrong? Or maybe we've actually gotten even the most basic strokes wrong? And it is impossible for someone from the past to tell us whether the guesses we make are correct or not. So, I would like to ask, how do we verify that our guesses about ancient texts or speech are correct?

MengChenC commented 4 years ago

Hi Professor Clindaniel, this is a really appealing research target, thank you for sharing. You mention there are two approaches to decompose the Inka writing system: knot/cord-level of analysis and khipu-level analysis, how do you integrate them to construct a full appearance of the Inka writing? Besides, would you also be able to discuss more regarding the possibility of the combination of multidisciplinary work to decipher the Inka khipu signs? Thank you.

sabinahartnett commented 4 years ago

Thanks for sharing your work Professor Clindaniel! In a similar vein to @Raychanan's question, did you try to replicate any of the research considered 'foundational' in studying Inkan traditions (esp. any of Hyland's research) using computational methods? I would be interested to see how that would compare to the conclusions found by more traditional methods. I am also wondering about how you divided the collected information into training sets and testing sets (since this is not a notation in use today, you cannot train, predict and evaluate the predictions in real time)? -Sabina

luxin-tian commented 4 years ago

Thank you for sharing your work. It is interesting to see a combination of computational methods with anthropology research. I am an outsider in this field, but I am curious: as most previous studies in this field seldom engage computational methods, what new insights can computation bring to this field if researchers re-examine previous findings from a quantitative perspective? Will pattern recognition techniques help produce more findings?

YuxinNg commented 4 years ago

Hi Jon, thanks for sharing! I am always curious about how ancient language systems got deciphered. Now that we have computational methods as a strong tool, I am wondering if there is any previous case in which computational methods were applied to help with the decipherment process?

mintaow commented 4 years ago

Hi Jon! Thanks for sharing your research. I really enjoyed this week's reading, particularly the well-explained experiment design in identifying the relationship between Khipu magnitude (maximum pendant cord value) and color pattern signs on khipus (Chapter 6.2).

I am curious about how you deal with endogeneity problems when deriving the above-mentioned causal relationship between magnitude and color patterns. As shown below, the result is very promising, but I am not sure whether it could be caused by some hidden factors simultaneously affecting both the color banding/seriation and the Khipu magnitude? [image]

I am also quite concerned about sample size issues: when controlling for spatial variables in the above regression model, our sample size shrinks from around 269 to 130, so I am not sure if the log-transformed x and y still follow the normal distribution assumption. How should we address these concerns? [image]

Thanks! Mintao
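
One common way to probe the small-sample worry raised above without leaning on a normality assumption is a nonparametric bootstrap of the regression slope. The following is only a sketch on synthetic data, assuming a simple one-predictor least-squares fit between two log-transformed variables; the variable names, the sample size of 130, and the true slope of 0.5 are illustrative assumptions and do not reproduce the model in Chapter 6.

```python
import random


def ols_slope(xs, ys):
    """Least-squares slope of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def bootstrap_slope_ci(xs, ys, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval: resample (x, y) pairs with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        slopes.append(ols_slope([xs[i] for i in idx], [ys[i] for i in idx]))
    slopes.sort()
    lower = slopes[int((alpha / 2) * n_boot)]
    upper = slopes[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic stand-in data (NOT the khipu data): 130 observations.
data_rng = random.Random(42)
log_x = [data_rng.gauss(3.0, 1.0) for _ in range(130)]
log_y = [0.5 * x + data_rng.gauss(0.0, 1.0) for x in log_x]

slope_hat = ols_slope(log_x, log_y)
ci_low, ci_high = bootstrap_slope_ci(log_x, log_y)
print(f"slope = {slope_hat:.3f}, 95% bootstrap CI = ({ci_low:.3f}, {ci_high:.3f})")
```

If the bootstrap interval and the textbook normal-theory interval roughly agree, the normality concern matters less for the conclusion; a bootstrap does not, however, address the endogeneity question.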

j2401 commented 4 years ago

The problem that troubles me most throughout my reading is how we can deal with a sample that we already recognize as biased. In section 3.2 I noticed that the OKR dataset mostly consists of Khipus from a certain area. Though OKR is a relatively large collection of Khipus, as argued in the book, it would still be helpful to justify its representativeness. Further, the assessment of the geographic scope of code use (sections 3.4 and 6.2) assumes that the location where a Khipu was found reveals where it was produced, which makes it feel even more necessary to address the bias in the sample before doing more statistical analysis. It would be much appreciated if you could share more with us on how you deal with this problem.

Moreover, just like many others mention, it is interesting to think about how we convince ourselves that our understanding is correct after making the best guesses based on the evidence we have. Some inferences are more like a known-plaintext attack, but there are more about which we are not so confident. Many thanks for sharing more with us tomorrow!

hesongrun commented 4 years ago

Thanks for the presentation. The research is very cool! I am very curious about the use of computation techniques to decipher Khipus. This feels like unsupervised learning. How do we evaluate the effectiveness of the decoding? How do we combine existing archeological evidence with computation? Thanks!

linghui-wu commented 4 years ago

This research, which employs computational and quantitative techniques in anthropology, is so enlightening! As already mentioned by @nwrim and @minminfly68, I would very much like to know what you think would be the most valuable takeaway from this paper for students in other social science disciplines? Looking forward to your presentation!

YileC928 commented 4 years ago

Hi Jon! Thanks for bringing us this wonderful sharing session! I'm curious about your research journey - how you found this interesting topic and how you decided on involving computational methodologies in it. I know this is a pretty general question, but as mentioned by @linghui-wu above, I also really want to learn whether there are any implications and suggestions you could provide for students who are currently new to CSS and are finding ways to start their own research. Looking forwards to the sharing!

timqzhang commented 4 years ago

Hi Jon, this paper is quite insightful for an outsider to this field like me. I'm wondering to what degree contemporary technology could help model and uncover the "mysterious" algorithms of ancient times, just like this paper does. Is it also beneficial to our modern tech, as sometimes the ancient methods are quite unexpected?

a-bosko commented 4 years ago

Hi Jon! Thank you for your research, it is a very interesting topic that helps us understand more about history and culture. As data becomes more and more accessible digitally, do you believe that we will be able to have more breakthroughs on undeciphered languages and writing systems in the future? With advances in machine learning, is it possible that we will be able to have computers quickly analyze any historical writing system, or will we always need people to help understand the meanings of old writings and hieroglyphs?

Thank you, Angie

bowen-w-zheng commented 4 years ago

Hi Prof. Jon Clindaniel, thanks for sharing your work. I have two questions, one about the analysis in Chapter 6 and one more general.

  1. Just curious, but why did you reduce the longitude and latitude to one dimension when you did the analysis with spatial data? Since the model is pretty low-dimensional, I thought reducing two variables to one would not help the convergence rate much, and it might make the results harder to interpret. Is this a convention in dealing with spatial data? If this 'provenance' factor is found to be significant, how would you interpret the result?
  2. Another question about research in general. For anthropologists, is it difficult to come up with testable hypotheses before seeing what data are available to work with? If this is the case, how can one avoid coming up with hypotheses that overfit the data (i.e. inadvertently doing multiple testing without correction)?
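
As a concrete reference point for the "multiple testing without correction" concern in the second question, one standard remedy is the Holm step-down adjustment of p-values. A minimal sketch with made-up p-values (the function name `holm_adjust` and the numbers are illustrative; nothing here comes from the book's analysis):

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[i])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


# Made-up p-values from four hypothetical exploratory tests.
raw_p = [0.01, 0.04, 0.03, 0.20]
adjusted_p = holm_adjust(raw_p)
for raw, adj in zip(raw_p, adjusted_p):
    print(f"raw p = {raw:.2f} -> Holm-adjusted p = {adj:.2f}")
```

With these numbers only the first test stays below 0.05 after adjustment, which illustrates how results that look significant in isolation can fade once the number of tests is accounted for.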

chentian418 commented 4 years ago

Hi Jon! Thank you for sharing this interesting research. I am impressed by how computational methods such as econometrics are applied in anthropological fields! While quantitative or data science methods can boost research ideas by using advanced technologies to collect, process and analyze datasets, I was wondering how we can justify the relationship between the models or methods we employ and the story we tell, e.g., in the case of investigating color banding and seriation for Khipu-level analysis? In other words, how do we know that the conclusion really follows from the models, rather than it being a coincidence that the computational models and the datasets fit the story, with other explanations emerging as more advanced quantitative models and more comprehensive datasets are developed? Thanks!

skanthan95 commented 4 years ago

Hi Jon! Looking forward to your talk tomorrow. Here are some of my questions:

(1) Have you discovered regional 'dialects' within the Inka Khipu, across the Inka empire? Given that it's difficult to make granular inferences about what the knots represent, how can these dialects be identified or differentiated from a substantive semantic difference or noise?

(2) I noticed that you used Python for a lot of your computational analysis. What was your rationale for using this programming language over others?

(3) I'm curious about the relationship between fluency in Quechua and ability to decipher the Inka khipus, and would love to hear more about how Quechuan syntax informs khipu knot structure, color, etc.

NikkiTing commented 4 years ago

As many have already said, the research is very interesting even for those of us who are not in the same field. I wanted to ask what might be a good starting point for thinking about how to use computational methods to understand other pre-colonial languages that are nearly gone? Also, in a wide group of people colonized by the same nation/empire, would it be meaningful/helpful to look at similarities in colonial influences on the languages used by the people in order to decipher languages with less pre-conquest data? Thank you!

zixu12 commented 4 years ago

Hi Prof. Jon, thanks for sharing your work. I am wondering whether anthropologists use the typologic classification of language (which are structural classifications including isolating, agglutinating, and inflecting) to decipher languages. If yes, when and how they will be used, especially related to computational methods. Intuitively, I think this classification will be useful, but just not sure. Thank you!

harryx113 commented 4 years ago

Hi Jon, thanks for sharing your work with us and for your dedication to this particular topic. It seems that archeology has limited computational researchers because computers were not existent a century ago and most data was not stored digitally. That said, what do you think could be done today to expedite the digital transformation for Inca khipu or archeology in general?

Bin-ary-Li commented 4 years ago

Hi Jon. First of all, very nice work. Love to see social scientists implement more computational methods in their work.

I do have some comments about the analysis process. Is there a reason why you are not doing a full GLM model with:

log-odds = beta0 + beta1*(Magnitude) + beta2*(Provenance) + beta3*(DistfromCuzco),

and do nested model comparison?

Also have you used any resampling technique on the statistic or cross-validated the model? I would be interested to see what the result will be like.

-Bin Li
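
The nested model comparison suggested above can be sketched as follows. This is only an illustration on synthetic data, assuming the outcome is a binary color-pattern indicator (for example banded versus seriated) and that predictors are added one at a time, so each likelihood-ratio test has one degree of freedom. The coefficients, the sample size, and the helper names `fit_logit` and `lr_test` are all assumptions made for the example; it is not the analysis in Chapter 6.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_logit(X, y, iters=25):
    """Newton-Raphson logistic regression; returns (coefficients, log-likelihood)."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(X, y):
            p = sigmoid(sum(b * v for b, v in zip(beta, xi)))
            w = p * (1.0 - p)
            for a in range(k):
                grad[a] += (yi - p) * xi[a]
                for c in range(k):
                    hess[a][c] += w * xi[a] * xi[c]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    ll = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(sum(b * v for b, v in zip(beta, xi)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        ll += math.log(p) if yi else math.log(1.0 - p)
    return beta, ll


def lr_test(ll_small, ll_big):
    """Likelihood-ratio test for ONE added parameter (chi-square with 1 df)."""
    stat = max(0.0, 2.0 * (ll_big - ll_small))
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Synthetic, purely illustrative data (NOT the khipu data).
rng = random.Random(0)
rows, y = [], []
for _ in range(300):
    magnitude, provenance, dist = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
    p_true = sigmoid(-0.3 + 1.2 * magnitude + 0.5 * provenance)  # dist has no effect
    rows.append((magnitude, provenance, dist))
    y.append(1 if rng.random() < p_true else 0)

designs = {
    "magnitude": [[1.0, m] for m, _, _ in rows],
    "+ provenance": [[1.0, m, pr] for m, pr, _ in rows],
    "+ dist_from_cuzco": [[1.0, m, pr, d] for m, pr, d in rows],
}
fits = {name: fit_logit(X, y) for name, X in designs.items()}
names = list(designs)
for small, big in zip(names, names[1:]):
    stat, p_value = lr_test(fits[small][1], fits[big][1])
    print(f"{small} vs {big}: LR = {stat:.2f}, p = {p_value:.4f}")
```

In practice a statistics library would replace the hand-written fitting routine; it is spelled out here only so the sketch has no dependencies. Cross-validation or a bootstrap of the fitted coefficients would be the natural next step for the resampling question.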

Dxu1 commented 4 years ago

Hi Jon, thank you very much for sharing your interesting work. This is the first article I have read about anthropology (and with computation). I have three brief thoughts/questions:

  1. The log transformation on maximum pendant cord is just clean. :))
  2. You have convinced me of a systematic relationship between color pattern and aggregation level in Inka khipus. To me, it suggests that color patterns are a consistent sign system across Inka groups. However, I am curious whether it is possible that different groups used the same sign system for different meanings. If so, could we identify this through the data using other variables (maybe more on the pattern recognition side, instead of maximum pendant cord)?
  3. A broader question regarding language: how do you think computation and data science could work with the digitization of language to a) "re-learn" languages that might have been forgotten already (but maybe with records) and b) preserve languages that are "dying"?

bjcliang-uchi commented 4 years ago

Thank you for this exciting paper. I do have questions about dimensionality reduction, but it seems that my cohort has already asked them. I am also wondering how this "language" interpretation model copes with variation among the people who used the language. That is, the symbolic meaning of the same expression changes over time. For example, when I was trying to interpret an ancient Samaritan amulet as an undergraduate, a big issue that I struggled with was the constant confluence of other expression patterns from surrounding cultures.

k-partha commented 4 years ago

Fascinating read. I have a couple of questions: To what extent do you think learning algorithms in conjunction with NLP can help in recovering "lost" languages and symbolic representation systems? On a more practical note, could you mention some Python packages/tools you think are setting the industry standard for advanced NLP work?

Qlei23 commented 4 years ago

Hi Prof. Jon! Thank you for sharing your work with us. From my perspective, computational analysis can be applied to this problem because there're several numeric attributes such as value and length (and other categorical variables) of the research subjects. I wonder if there're other situations where computational methods (for example, clustering?) can contribute to the field of anthropology.

qishenfu1 commented 4 years ago

Hi Jon, thank you very much for sharing your work with us! This is my first time encountering topics in computational anthropology and semiotic practices. I have a brief question: in Chapter 3, you mentioned that "neither codes nor legisigns need to be widely shared between people in a community; a single individual can utilize a personal convention or a personal code". Because of this inconsistency, a lot of interpreters were needed in the Near East during the second half of the second millennium. I am curious that given the non-unity of "language" among people, how did they communicate and spread their thoughts and culture? Moreover, the part I think is most interesting is "interpreting the replication of color pattern signs", especially the argument that color banding and seriation were symbolic signs offering information about itself in a propositional fashion. This provides me with a new aspect of understanding color symbols.

siruizhou commented 4 years ago

Thanks for sharing. I'm also new to this and I'm impressed by the way ancient mysteries become organized thanks to digitization. I am curious whether this archaeological deciphering work has any connection with modern cryptography? Do you share any common methods?

Yilun0221 commented 4 years ago

Hi Jon! I am really excited to read your paper on the combination of archaeology and computational methods. But I think the accuracy of the data depends largely on current technologies. What confuses me is this: if we cannot identify the original meaning of the figures, how do we measure the accuracy of the research?

MkramerPsych commented 4 years ago

Professor Clindaniel,

Looking forward to your presentation!

My questions are as follows:

(1) As I am unfamiliar with computational linguistics, I am curious as to your theoretical approach: The khipu appears to have a great deal in common with a computational language. Does the argument for khipu as a language draw from the computational utility provided by the language? I see analogs to elementary programming languages.

(2) When discussing the conventionalization of khipu use, I am curious if you view a hierarchical approach when using modeling techniques to answer questions about khipu use? I often see models that examine both individual and group data, and I wonder if the different utilizations of khipu alongside more general understanding can improve understanding.

I am also curious as to the possible application of cryptographic analysis to the khipu language.

yutianlai commented 4 years ago

Hi Jon! Thank you so much for sharing your work with us! I'm wondering about the next step you would take. How would you like to further explore this topic?

heathercchen commented 4 years ago

Hi Jon! Welcome back to the workshop! I am wondering how the numerical values in Inka khipus were deciphered in the first place. (Maybe it is shown somewhere in the earlier chapters.) I remember that, for the decoding of Egyptian hieroglyphs, we found a comparison between the unknown language and something that we already knew. But how were the Inka numbers deciphered while many other parts of the language remained unknown?

ydeng117 commented 4 years ago

Hi Jon! Thank you for sharing such interesting research. I am curious about how you managed the temporality of the data. A language or recording system may evolve over time for many reasons. I saw you have addressed the geographical aspect in Chapter 6, so I was wondering how your research would handle the temporal one. Did your data come from a specific era, in which the writing habits would not have changed at all?

chrismaurice0 commented 4 years ago

Hi Jon,

Excited to discuss this research more in-depth tomorrow. One of your main findings was that the Inka Khipus show Colesuyu was under an indirect, more independent relationship with the rest of the Inka empire (or Cuzco). Further, you state that Colesuyu could have very well "spawned their own local codes". Does this finding of yours help other researchers answer open questions in their own work about how Colesuyu differed from the rest of the Inka Empire, or about its relationship to the Inka Empire in general?

romanticmonkey commented 4 years ago

Hi Jon! Through your research into the Inka Writing System, have you found evidence, or speculated, that humans have an innate propensity to compute, memorize, or communicate in a certain way? Although this question leans toward the psych/neuro side, I want to hear how you would interpret and answer it with your Inka archaeology expertise. Thanks!

xxicheng commented 4 years ago

Hi Jon,

Thanks for sharing your work with us. I have a similar question to Nak Won Rim's: I also think anthropology seems to be a social science discipline that does not welcome quantitative analysis. I am wondering about your opinion on this question. Thank you, and looking forward to your presentation tomorrow :)

-Xi

hhx2207061197 commented 4 years ago

Hi Jon! Thank you for sharing such interesting research. I actually care about how you think about the deficiencies of the present research design and how you want to improve it in the future.