DS4PS / cpp-527-spr-2020

Course shell for CPP 527 Foundations of Data Science II for Spring 2020.
http://ds4ps.org/cpp-527-spr-2020/

Report Template Assignment #13

Open sunaynagoel opened 4 years ago

sunaynagoel commented 4 years ago

@lecy When I knit my .rmd files (index.rmd and resume.rmd), I get the following error. I tried to look up this error but could not solve the problem.

pandoc: Cannot decode byte '\xfc': Data.Text.Internal.Encoding.decodeUtf8: Invalid UTF-8 stream
Error: pandoc document conversion failed with error 1
Execution halted

Thanks ~Nina

sunaynagoel commented 4 years ago

@lecy After changing save .rmd file with encoding from default UTF-8 to ASCII, i was able to knit the file.

meliapetersen commented 4 years ago

@lecy After changing save .rmd file with encoding from default UTF-8 to ASCII, i was able to knit the file.

Hi Nayna, I am getting the same issue. Can you explain a little further what you did? I don't understand what you mean when you said you changed the "save .rmd file with encoding from default UTF-8 to ASCII". Thank you!!!

JaesaR commented 4 years ago

Whenever I knit my resume.rmd, the skill bars on the aside don't show up, and the color of the font in that section becomes white. This is my code:

skills <- tribble(
  ~skill,               ~level,
  "Microsoft Excel",     5,
  "Microsoft PowerPoint",4.5,
  "Tableau",             4,
  "R",                   3.5,
  "Microsoft Access",    3,
  "SQL",                 3
)
build_skill_bars(skills)

I tried to look up how to build the skill-bars, and added the following but I am still getting the same issue:

# Construct a bar chart of skills
build_skill_bars <- function(skills, out_of = 5){
  bar_color <- "#969696"
  bar_background <- "#d9d9d9"
  skills %>% 
    mutate(width_percent = round(100*level/out_of)) %>% 
    glue_data(
      "<div class = 'skill-bar'",
      "style = \"background:linear-gradient(to right,",
      "{bar_color} {width_percent}%,",
      "{bar_background} {width_percent}% 100%)\" >",
      "{skill}",
      "</div>"
    )
}

How do I fix this?

meliapetersen commented 4 years ago

I tried to use the resume.rmd doc to see if I could knit that, but now I am not able to pull my "Industry Experience" data from the CSV. I am getting the error when I run the file:

Error: Column `id` must be length 0 (the number of rows) or one, not 2

It was working before and I have not changed anything in my CSV file that would cause any issue.

sunaynagoel commented 4 years ago

@lecy When I knit the .rmd file it produces only one page of output (it should be 2-3 pages). Is there anything I should change to get multiple pages of output?

@lecy After changing save .rmd file with encoding from default UTF-8 to ASCII, i was able to knit the file.

Hi Nayna, I am getting the same issue. Can you explain a little further what you did? I don't understand what you mean when you said you changed the "save .rmd file with encoding from default UTF-8 to ASCII". Thank you!!!

Hello @meliapetersen. Sorry it took me so long to get to my computer. In RStudio, with the index.rmd file still open, I went to File > Save with Encoding. There are several options; the default is UTF-8, and I changed it to ASCII and it worked. When I googled the error I found that other people using a Mac had the same issue, and changing the encoding helped them too. Let me know if this works.
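
Before re-saving, it can help to find which line holds the offending character. Below is a minimal base-R sketch. `find_invalid_utf8()` is a hypothetical helper, not part of the template, and it assumes the stray bytes are Latin-1 (the `\xfc` in the error is "ü" in that encoding).

```r
# Hypothetical helper: list the lines of a file that are not valid UTF-8.
find_invalid_utf8 <- function(path) {
  lines <- readLines(path, warn = FALSE)
  bad <- which(!validUTF8(lines))
  data.frame(
    line = bad,
    # Assumption: the stray bytes are Latin-1; convert them for display only.
    text = iconv(lines[bad], from = "latin1", to = "UTF-8"),
    stringsAsFactors = FALSE
  )
}

# Demo on a throwaway file that contains the byte 0xfc.
demo_file <- tempfile(fileext = ".Rmd")
writeBin(
  c(charToRaw("name: M"), as.raw(0xfc), charToRaw("ller\nplain ascii line\n")),
  demo_file
)
invalid_lines <- find_invalid_utf8(demo_file)
print(invalid_lines)
```

Running the helper on index.rmd or resume.rmd should point to the lines worth retyping or re-saving.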

lecy commented 4 years ago

@sunaynagoel Thanks for the tip on the UTF-8 to ASCII issue. I am not on a mac so I have not run into that problem.

@meliapetersen Did that work for you?

lecy commented 4 years ago

@meliapetersen

I tried to use the resume.rmd doc to see if I could knit that, but now I am not able to pull my "Industry Experience" data from the CSV. I am getting the error when I run the file:

I would need a file and code to be able to assess. Do you have any special characters in your text?

lecy commented 4 years ago

@JaesaR Did you change anything on the CSS file?

Also, did you only preview in the RStudio preview browser? Or did you open it in a real browser? Sometimes it looks fine in a real browser but weird in the preview mode.

If you test the function, it provides the proper output:

<div class = 'skill-bar'style = "background:linear-gradient(to right,#969696 100%,#d9d9d9 100% 100%)" >Microsoft Excel</div>
<div class = 'skill-bar'style = "background:linear-gradient(to right,#969696 90%,#d9d9d9 90% 100%)" >Microsoft PowerPoint</div>
<div class = 'skill-bar'style = "background:linear-gradient(to right,#969696 80%,#d9d9d9 80% 100%)" >Tableau</div>
<div class = 'skill-bar'style = "background:linear-gradient(to right,#969696 70%,#d9d9d9 70% 100%)" >R</div>
<div class = 'skill-bar'style = "background:linear-gradient(to right,#969696 60%,#d9d9d9 60% 100%)" >Microsoft Access</div>
<div class = 'skill-bar'style = "background:linear-gradient(to right,#969696 60%,#d9d9d9 60% 100%)" >SQL</div>

So it has to be a CSS (or document style) issue.
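
To separate the R side from the CSS side, the markup can be generated and inspected outside the knit. This is a minimal base-R sketch of the same calculation. `skill_bar_html()` is a hypothetical stand-in for testing, not the template's `build_skill_bars()`, and its attribute spacing differs slightly from the output above.

```r
# Hypothetical stand-in: build one skill-bar <div> per skill using base R only.
skill_bar_html <- function(skill, level, out_of = 5,
                           bar_color = "#969696",
                           bar_background = "#d9d9d9") {
  # Same arithmetic as the template: level as a rounded percentage of out_of.
  width_percent <- as.integer(round(100 * level / out_of))
  sprintf(
    "<div class='skill-bar' style=\"background:linear-gradient(to right, %s %d%%, %s %d%% 100%%)\">%s</div>",
    bar_color, width_percent, bar_background, width_percent, skill
  )
}

bars <- skill_bar_html(c("Microsoft Excel", "R", "SQL"), c(5, 3.5, 3))
writeLines(bars)
```

If the strings look right here but the bars are still missing in the browser, the `.skill-bar` rule in the stylesheet is the next place to look.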

sunaynagoel commented 4 years ago

@lecy There is a link in the file to download a PDF version of the CV. I could not understand it well enough to change it so that clicking the link gives me the PDF of my current file. Even after changing the path, it keeps giving me the PDF of the original author's CV.

sunaynagoel commented 4 years ago

@sunaynagoel Thanks for the tip on the UTF-8 to ASCII issue. I am not on a mac so I have not run into that problem.

@meliapetersen Did that work for you?

I am glad I was able to help.

lecy commented 4 years ago

@sunaynagoel

When I knit the .rmd file it produces only one page of output (it should be 2-3 pages). Is there anything I should change to get multiple pages of output?

So I understand, are you missing data in your CV? Or it included all of your data and it is not long enough, so you are asking whether the 2-3 pages is mandatory?

Did you use the resume option? That limits output to 2 pages, and you must tag all of your data as "include on resume" in the CSV file.

lecy commented 4 years ago

@sunaynagoel In the RMD file you will see this section:

# When in export mode the little dots are unaligned, so fix that. 
if(PDF_EXPORT){
  cat("View this CV online with links at _nickstrayer.me/cv_")
} else {
  cat("[<i class='fas fa-download'></i> Download a PDF of this CV](https://github.com/nstrayer/cv/raw/master/strayer_cv.pdf)")
}

You need to upload your PDF to GitHub first, then you can navigate to it online, and you will see the download button. Right-click and copy that URL.

That produces this link: https://github.com/nstrayer/cv/raw/master/strayer_cv.pdf
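
The link follows a regular pattern, so it can also be assembled by hand. This is a minimal sketch. `github_raw_url()` and `download_link()` are hypothetical helpers, and the `master` default is an assumption taken from the link above (your repository's default branch may be named differently).

```r
# Hypothetical helper: build the "raw" download URL for a file in a GitHub repo.
github_raw_url <- function(user, repo, file, branch = "master") {
  sprintf("https://github.com/%s/%s/raw/%s/%s", user, repo, branch, file)
}

# Markdown link in the same shape as the template's cat() call.
download_link <- function(url) {
  sprintf("[<i class='fas fa-download'></i> Download a PDF of this CV](%s)", url)
}

# Reproduces the original author's link; swap in your own user, repo, and file.
pdf_url <- github_raw_url("nstrayer", "cv", "strayer_cv.pdf")
cat(download_link(pdf_url), "\n")
```

Pasting your own URL into the `else` branch of the chunk above is what makes the link point at your PDF.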

sunaynagoel commented 4 years ago

@sunaynagoel

When I knit the .rmd file it produces only one page of output (it should be 2-3 pages). Is there anything I should change to get multiple pages of output?

So I understand, are you missing data in your CV? Or it included all of your data and it is not long enough, so you are asking whether the 2-3 pages is mandatory?

Did you use the resume option? That limits output to 2 pages, and you must tag all of your data as "include on resume" in the CSV file.

I was able to fix the problem. I am using the CV option, and I repopulated all the data in the .csv file. But I guess the program is very picky: it was treating trailing extra spaces and apostrophes (') as special characters. Once I fixed all the special characters, it worked perfectly. Sorry for the confusion.
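
A quick scan can surface cells like these before knitting. Below is a minimal base-R sketch. `flag_suspect_cells()` is a hypothetical helper and the sample data frame is made up; which characters actually break the template is not established here, so the pattern simply flags trailing or leading whitespace, apostrophes, and anything outside printable ASCII.

```r
# Hypothetical helper: report text cells with stray whitespace, apostrophes,
# or characters outside printable ASCII.
flag_suspect_cells <- function(df) {
  text_cols <- names(df)[vapply(df, is.character, logical(1))]
  hits <- lapply(text_cols, function(col) {
    x <- df[[col]]
    padded <- !is.na(x) & x != trimws(x)
    odd_char <- !is.na(x) & grepl("[^\\x20-\\x7E]|'", x, perl = TRUE)
    rows <- which(padded | odd_char)
    data.frame(
      row = rows,
      column = rep(col, length(rows)),
      value = x[rows],
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, hits)
}

# Made-up rows standing in for positions.csv.
positions <- data.frame(
  section = c("industry_positions", "teaching_positions "),
  title = c("Analyst", "Instructor's Aide"),
  stringsAsFactors = FALSE
)
suspects <- flag_suspect_cells(positions)
print(suspects)
```

With a real file, the same call would run on the result of reading positions.csv as text.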

lecy commented 4 years ago

It's a non-trivial problem. There are hundreds of ways to break your data structures with these problems. For example:

https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-1044-7

Gene name errors are widespread in the scientific literature

Abstract The spreadsheet software Microsoft Excel, when used with default settings, is known to convert gene names to dates and floating-point numbers. A programmatic scan of leading genomics journals reveals that approximately one-fifth of papers with supplementary Excel gene lists contain erroneous gene name conversions.

It's one of the reasons why some programs have started using JSON format, which provides a little more structure:

album, year, US_peak_chart_post
The White Stripes, 1999, -
De Stijl, 2000, -
White Blood Cells, 2001, 61
Elephant, 2003, 6
Get Behind Me Satan, 2005, 3
Icky Thump, 2007, 2
Under Great White Northern Lights, 2010, 11
Live in Mississippi, 2011, -
Live at the Gold Dollar, 2012, -
Nine Miles from the White City, 2013, -

{
"The White Stripes":{"year":1999,"US_peak_chart_post":"-"},
"De Stijl":{"year":2000,"US_peak_chart_post":"-"},
"White Blood Cells":{"year":2001,"US_peak_chart_post":61},
"Elephant":{"year":2003,"US_peak_chart_post":6},
"Get Behind Me Satan":{"year":2005,"US_peak_chart_post":3},
"Icky Thump":{"year":2007,"US_peak_chart_post":2},
"Under Great White Northern Lights":{"year":2010,"US_peak_chart_post":11},
"Live in Mississippi":{"year":2011,"US_peak_chart_post":"-"},
"Live at the Gold Dollar":{"year":2012,"US_peak_chart_post":"-"},
"Nine Miles from the White City":{"year":2013,"US_peak_chart_post":"-"}
}
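
The CSV version above also shows how a placeholder can silently change a column's type. This is a minimal base-R sketch using three of the rows, with the spaces after the commas removed (an assumption made for the example):

```r
csv_text <- "album,year,US_peak_chart_post
The White Stripes,1999,-
White Blood Cells,2001,61
Elephant,2003,6"

# Read as-is: the "-" placeholder forces the whole column to text.
naive <- read.csv(text = csv_text, stringsAsFactors = FALSE)
class(naive$US_peak_chart_post)   # "character"

# Declare "-" as missing and the column is read as numbers again.
clean <- read.csv(text = csv_text, na.strings = "-", stringsAsFactors = FALSE)
class(clean$US_peak_chart_post)   # "integer"
```

In the JSON version the distinction is already explicit: `"-"` is quoted as a string, while `61` is a number.
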

sunaynagoel commented 4 years ago

@lecy I have a general question. After the GitHub repository is created what is a better practice to edit .rmd file?

  1. Locally on our computers, committing changes through GitHub Desktop?
  2. Directly on GitHub using the editor?

JaesaR commented 4 years ago

@lecy I'm a bit confused about how to submit the URL to our resume. When I knit the file and open in a web browser I get the following link: file:///C:/Users/jtroger1/Documents/GitHub/CV/cv/resume.html Which is just a file on my computer, and I don't believe you would have access to it. I tried editing the following:

# When in export mode the little dots are unaligned, so fix that. 
if(PDF_EXPORT){
  cat("View this CV online with links at _jaesa.me/cv_")
} else {
  cat("[<i class='fas fa-download'></i> Download a PDF of this CV](https://github.com/nstrayer/cv/raw/master/strayer_cv.pdf)")
}

but I don't know how to make my resume into a pdf to add that pathway. What should I do here?

meliapetersen commented 4 years ago

@meliapetersen

I tried to use the resume.rmd doc to see if I could knit that, but now I am not able to pull my "Industry Experience" data from the CSV. I am getting the error when I run the file:

I would need a file and code to be able to assess. Do you have any special characters in your text?

@lecy No, I don't have any special characters. I'm going to email you both of my docs.

lecy commented 4 years ago

@JaesaR One of the take-aways of this exercise is figuring out how to share work.

R Markdown gives you access to a publishing platform that can package your analysis in a variety of formats. The problem is you need a good way to share these products.

GitHub Pages is a powerful and free platform that allows us to create custom websites. They have simplified the process by allowing users to write content in markdown, then on the back-end GitHub will convert each MD file to an HTML file, and provide a URL.

RMD will allow you to create HTML documents that you can email and share, but you are correct that people can't view the file using your local path describing your directory structure:

file:///C:/Users/jtroger1/Documents/GitHub/CV/cv/resume.html

After you activate the GitHub pages option in your CV repository you should have a link like this:

https://ds4ps.github.io/cv

Your rendered index file will show up here:

https://ds4ps.github.io/cv/index.html

BASE USER URL / REPO NAME / FILE NAME
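
The pattern can be written out as a small function. This is a minimal sketch. `github_pages_url()` is a hypothetical helper, and lower-casing the user name is only for tidiness, since host names are not case-sensitive.

```r
# Hypothetical helper: BASE USER URL / REPO NAME / FILE NAME
github_pages_url <- function(user, repo, file = "index.html") {
  sprintf("https://%s.github.io/%s/%s", tolower(user), repo, file)
}

cv_url <- github_pages_url("DS4PS", "cv")
resume_url <- github_pages_url("DS4PS", "cv", "resume.html")
cat(cv_url, resume_url, sep = "\n")
```

The repository name keeps its capitalization, so a repo called `CV` appears as `/CV/` in the link.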

For the PDF version, Strayer is just saving the HTML file as a PDF in Chrome:

strayer_cv.pdf: The final exported pdf as rendered by Chrome on my mac laptop. Links are put in footer and notes about online version are added.

Once your CV is uploaded to GitHub, you can right-click on the download button and get the CV link.

https://github.com/nstrayer/cv/raw/master/strayer_cv.pdf

castower commented 4 years ago

Was anyone able to solve the following?

Error: Column `id` must be length 0 (the number of rows) or one, not 2

I've checked for special characters and can't seem to find any, yet it keeps getting stuck at this place. I'm not sure what to do.

castower commented 4 years ago

Was anyone able to solve the following?

Error: Column `id` must be length 0 (the number of rows) or one, not 2

I've checked for special characters and can't seem to find any, yet it keeps getting stuck at this place. I'm not sure what to do.

As soon as I posted this, I figured it out!

If anyone else is having this problem, it's looking for a 'positions' section. Therefore, you can either re-name the teaching and research positions to be 'positions' or you can change the formula to work accordingly (i.e. add the proper positions to line 39...I work in education and teaching positions weren't an option, so I had to add it!).

lecy commented 4 years ago

At the top of the resume.Rmd file there is this code that recodes the section labels, collapsing some of them into a single 'positions' section.

# First let's get the data, filtering to only the items tagged as
# Resume items
position_data <- read_csv('positions.csv') %>% 
  filter(in_resume) %>% 
  mutate(
    # Build some custom sections by collapsing others
    section = case_when(
      section %in% c('industry_positions', 'leadership_positions') ~ 'positions', 

      TRUE ~ section
    )
  ) 

If you change this:

position_data %>% print_section( 'industry_positions' )

To this:

position_data %>% print_section( 'positions' )

It should work.
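
To see why the old label stops matching, here is a minimal base-R sketch with made-up rows. `ifelse()` stands in for the `case_when()` recode, and the template's `print_section()` is not reproduced, only the filtering it depends on (an assumption about how it selects rows).

```r
# Made-up rows standing in for positions.csv.
positions <- data.frame(
  section = c("industry_positions", "leadership_positions", "education"),
  title = c("Analyst", "Team Lead", "MPA"),
  stringsAsFactors = FALSE
)

# Base-R equivalent of the recode: both labels collapse into 'positions'.
collapse <- c("industry_positions", "leadership_positions")
positions$section <- ifelse(
  positions$section %in% collapse, "positions", positions$section
)

# After the recode, the old label matches nothing; the new one matches both rows.
old_label <- positions[positions$section == "industry_positions", ]
new_label <- positions[positions$section == "positions", ]
nrow(old_label)
nrow(new_label)
```

A section with zero rows is consistent with the "must be length 0 (the number of rows)" wording in the error. The same applies to any label, such as a teaching section, that is either missing from the recode list or requested under its old name.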

JaesaR commented 4 years ago

@lecy I am still having trouble connecting my resume to the GitHub Pages site. Currently I have a site up and running: https://jaesar.github.io/CV/ but as you can see when you click on it, it shows you Nick Strayer's CV. I'm confused because I've edited the resume.rmd, index.rmd, and positions.csv, and have committed all of the changes. My repository can be found here: https://github.com/JaesaR/CV

RickyDuran commented 4 years ago

@JaesaR I was running into a similar issue, but you have to pull the index.html or resume.html file: https://rickyduran.github.io/cv/index.html

your CV: https://jaesar.github.io/CV/index.html
your resume: https://jaesar.github.io/CV/resume.html

JaesaR commented 4 years ago

@RickyDuran Wow this is super helpful! Thanks, Ricky!

castower commented 4 years ago

@JaesaR if you want your resume to appear on the front page, you can rename your resume file to index.html and call your CV file something else (like CV.html). It'll automatically show your resume.

lecy commented 4 years ago

That's the main difference between DropBox and GitHub.

In DropBox if you save a file on your computer in a DropBox folder it automatically sends those changes to the cloud, and then the cloud pushes all of those changes to all of the computers that have your DropBox account installed, or collaborators that share that folder. Everything is supposed to be real-time.

In computer science, you only want to sync everything once you know the code is working. If it functioned the same way as DropBox, then when you saved your work before going to bed it would update the code everywhere, constantly introducing bugs, and everything would break.

With code you need to be able to stage your testing. Typically you develop code with some test data until it seems to be working properly. Then you might integrate that new code on a dummy server that has a copy of the code for your website or platform and test it while it's operating in the actual environment.

Once you are certain it works then you integrate the code into your live server version. Otherwise engineers would constantly be breaking Facebook, your bank account, the internet...

So "commit" is like saving changes locally, and you need to push them to GitHub to fully integrate the code or your rendered RMD / HTML files into the cloud version.