Open Sicheng2000 opened 7 months ago
@Sicheng2000 Great self-assessment. It is very detailed and demonstrates that you have really put some thought into this lab.
I hope that you can resolve the issue that you are having pushing this repo to GitHub as I would really like to take a look at your code and make specific comments and suggestions there.
Let me know if you need more help on this. If need be, set up an appointment with me.
`readLines()` function and the `stringr` package with regular expressions to extract the desired words from my translations in Chinese and the source text in English. However, I later discovered that my files were in Word document format, requiring the use of the `readtext()` function. Additionally, I realized that adjustments were necessary to ensure the computer could effectively read my Chinese translations. Furthermore, extracting variables from implicit plain text created an additional challenge. In the case of semi-structured data, separating annotations from the text can be particularly challenging. If the pattern I set does not match everything, it becomes difficult to verify the accuracy of the separation. Checking line by line can be time-consuming, especially with a large number of observations. When dealing with structured data, identifying redundant columns or columns that do not provide useful information is crucial. Reorganizing the data in a clearer manner is essential for improving its usability.
`2_curate_key.qmd`. However, upon completing my assignment, I encountered difficulties pushing my revised version to the GitHub repository. After consulting resources such as https://stackoverflow.com/questions/20370294/git-could-not-resolve-host-github-com-error-while-cloning-remote-repository-in, I tried many ways including unsetting `https.proxy`, setting my remote URL origin, and changing my access token. Unfortunately, none of these attempts were successful. As a result, I am seeking further guidance from Professor @francojc.
`readLines` function can be used to obtain parallel text, but based on my own experience, additional steps are needed to obtain "real" parallel text.
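
On the point above about verifying a pattern without checking line by line: one option might be to collect only the lines the pattern fails to match and review those. The following is a minimal sketch in R with `stringr`, not the code from the lab. The sample lines, the bracketed annotation format, and the names `lines`, `pattern`, `parsed`, and `unmatched` are all hypothetical, made up for illustration.

```r
library(stringr)

# Hypothetical semi-structured lines: an annotation in square brackets,
# followed by the text. The third line is deliberately malformed.
lines <- c(
  "[S1] The cat sat on the mat.",
  "[S2] 猫坐在垫子上。",
  "S3] A line with a malformed annotation.",
  "[S4] Another well-formed line."
)

# Assumed format: "[annotation] text". Group 1 is the annotation,
# group 2 is the remaining text.
pattern <- "^\\[(\\w+)\\]\\s+(.*)$"

# str_match() returns a character matrix: column 1 is the full match,
# the following columns are the capture groups. A line that does not
# match yields NA in every column.
parts <- str_match(lines, pattern)

parsed <- data.frame(
  annotation = parts[, 2],
  text = parts[, 3]
)

# Keep only the lines the pattern missed, so manual review is limited
# to these instead of every observation.
unmatched <- lines[is.na(parts[, 1])]
unmatched
```

If `unmatched` is empty, the pattern covered every line. Otherwise, the remaining lines show which cases the pattern still needs to handle.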