Open mylinhthibodeau opened 7 years ago
Peer Review:
Hi, My Linh! As you explained, I looked mainly at activity5 in the data reshaping part and activity2 in the join part. I also skimmed the long version of your homework. I think you did a really good job exploring your own dataset, and it is cool that you took advantage of what you learned in class to solve your research problems.
As for activity5, you matched different tasks with tidyr/dplyr functions, reshape2 functions, and base R operations separately. You used your own dataset, and you also called read.table() to open the local files, which is very useful.
Indeed, as you've concluded, I think most of the time tidyr is good for data preprocessing, such as data cleaning and sorting. It can handle most of the tasks in your classification. reshape2 is mainly for converting between long and wide data frames, using the melt and *cast functions. Both tidyr and reshape2 make it easier to prepare data for plotting with ggplot2.
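To make the long/wide comparison concrete, here is a minimal sketch of the same round trip in both packages. The data frame, gene names, and column names are made up for illustration and are not taken from the homework.

```r
# Hypothetical wide table: one row per gene, one column per cancer type.
wide <- data.frame(
  gene  = c("geneA", "geneB"),
  typeX = c(1.2, 3.4),
  typeY = c(5.6, 7.8),
  stringsAsFactors = FALSE
)

# Wide -> long with tidyr: stack the typeX/typeY columns into key/value pairs.
long_tidyr <- tidyr::gather(wide, key = "cancer_type", value = "expression",
                            typeX, typeY)

# Wide -> long with reshape2: everything except the id column is melted.
long_reshape2 <- reshape2::melt(wide, id.vars = "gene",
                                variable.name = "cancer_type",
                                value.name = "expression")

# Long -> wide again, once with each package.
wide_tidyr    <- tidyr::spread(long_tidyr, key = cancer_type, value = expression)
wide_reshape2 <- reshape2::dcast(long_reshape2, gene ~ cancer_type,
                                 value.var = "expression")
```

Either long form can be passed straight to ggplot2 with cancer_type mapped to an aesthetic. Newer tidyr releases also offer pivot_longer() and pivot_wider() for the same job.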
You created your own cheatsheet on join functions in activity2, and used different kinds of join functions, including mutating joins and filtering joins.
Each step has specific explanations and annotations. The only suggestion I would give is to do more proofreading, because some parts were not displayed well in your .md file.
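For anyone reading along, the difference between mutating and filtering joins mentioned above can be sketched with dplyr as follows. The two small tables are hypothetical and only stand in for the kind of data a join cheatsheet might use.

```r
# Hypothetical lookup tables sharing a "gene" key.
genes <- data.frame(
  gene       = c("geneA", "geneB", "geneC"),
  chromosome = c("1", "2", "3"),
  stringsAsFactors = FALSE
)
expr <- data.frame(
  gene       = c("geneA", "geneB", "geneD"),
  expression = c(1.2, 3.4, 9.9),
  stringsAsFactors = FALSE
)

# Mutating joins add columns from the second table.
left  <- dplyr::left_join(genes, expr, by = "gene")   # keeps every row of genes
inner <- dplyr::inner_join(genes, expr, by = "gene")  # keeps only matching rows

# Filtering joins keep or drop rows of the first table without adding columns.
semi <- dplyr::semi_join(genes, expr, by = "gene")    # rows of genes with a match
anti <- dplyr::anti_join(genes, expr, by = "gene")    # rows of genes without a match
```

Here left has an NA expression for geneC, while anti contains only the geneC row, which makes it a handy check for keys that failed to match.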
Overall, it looks like you have put significant time and effort into this assignment. Great work!
Regards, Jiahui Tang
Dear @Tangjiahui26,
Thank you so much for your feedback, I greatly appreciate it !
I had trouble with the RMarkdown formatting: I type two spaces at the end of a line so that RMarkdown inserts a line break. When I knitted the file with knitr on my personal computer, the formatting was perfect, as you can see HERE, but when I pushed the file to GitHub, the spaces were somehow lost, the titles were crammed together, and the formatting was messed up!
options(knitr.table.format = "markdown")
Thank you for your time. Warm regards, My Linh
Peer Review:
Hi, @mylinhthibodeau! You did excellent work on this homework and went above and beyond the requirements of the assignment. You only needed to pick one of the data reshaping prompts and one join prompt, but you went ahead and did all the activities/prompts, which is worthy of praise. It was also great to see you use datasets pertaining to your research work; this motivates people like me.
Data manipulations in R
Data joining
I also checked your long version, and it shows the kind of effort that you put into this assignment. You explored a lot more than was asked.
Some suggestions
It was great to see that you put the links to the files in the issue, as it makes the files easy to access. The fact that you mentioned the struggles you experienced and the solutions you followed is very helpful for other students. It was also nice to see you describe the new things you learned through this assignment in a clear and detailed manner. I hope to follow this approach in upcoming assignments. Thanks for that.
Overall, I think your homework was really well done and hope you can keep it up!
Regards, Arun Rajendran
Hi @mylinhthibodeau, here are some comments about your homework:
General data reshaping and relationship to aggregation: Yes
Join, merge, look up: Yes
Progress report: Yes
Extras (optional; merge/match): Yes
Same comments: perfect explanation of what you are planning to do or expect with each function.
I appreciate your explanations of what you did when you encountered an error. In your case, you found that the function was complaining because of duplicated gene names, so you filtered them and solved the problem.
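As a rough illustration of that kind of fix, one way to find and drop duplicated keys with dplyr is sketched below. This is not necessarily the approach used in the homework; the table and its values are invented, and keeping the first row per gene is an assumption made only for the example.

```r
# Hypothetical expression table in which geneA appears twice.
expr <- data.frame(
  gene       = c("geneA", "geneA", "geneB"),
  expression = c(1.0, 1.5, 2.0),
  stringsAsFactors = FALSE
)

# Count rows per gene and keep only the genes that occur more than once.
gene_counts <- dplyr::count(expr, gene)
dup_genes   <- dplyr::filter(gene_counts, n > 1)

# Keep the first row for each gene (an assumption: averaging or another
# rule may be more appropriate for real data).
deduped <- dplyr::distinct(expr, gene, .keep_all = TRUE)
```

Inspecting dup_genes before dropping anything shows exactly which keys would otherwise make a reshape or join ambiguous.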
Simple things like the format of your document can be a pain. I would recommend checking which format is more common in your area and using that.
To round off your perfect work I would add the name of the variable/cancer_type (in this case) that you are mentioning, for example: Observation: We can see which cancer type (variable) has the highest or lowest mean.cancer gene expression for individual genes. So I assume that is the cancer type A2M.
I think the most valuable part of your work is that you are using data from your own research. The coding could be refined, since there are many ways to do the same thing, but I think your work is flawless!
Your marks will be distributed later,
Regards,
Pedro G
Dear @abishekarun and @Tangjiahui26,
I unfortunately didn't realize that we only needed to pick two activities in total, and that's why my original Homework 4 (now entitled "long-version-stat545-hw04-thibodeau-mylinh") file is NOT THE ONE TO REVIEW.
For the MARKING/PEER REVIEW FILES, you can limit yourself to the README file HERE and these two cheatsheets:
The general homework 4 repository is here
Thank you for your time, and if you have any questions, please don't hesitate to let me know. Warm regards, My Linh