Dance-Data-Project / smith-capstone-23

MIT License

Create GH Action to create HTML artifacts from RMarkdown #17

Open ajhoekst opened 1 year ago

ajhoekst commented 1 year ago

Note: we might consider publishing to GitHub Pages. It would be good to confirm that Liza is 100% OK with this.

ajhoekst commented 1 year ago

Made a preliminary action. The current workflow fails, probably because no R packages are installed in the environment.
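For reference, here is a minimal sketch of how an R step in the workflow could check for missing packages and install them before rendering. The package names in the example call are an assumption, not the project's actual dependency list, and missing_packages() and install_missing() are hypothetical helpers.

```r
# Hypothetical helper: return the subset of `pkgs` that cannot be loaded.
missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

# Hypothetical helper: install whatever is missing, then report what was installed.
install_missing <- function(pkgs, repos = "https://cloud.r-project.org") {
  todo <- missing_packages(pkgs)
  if (length(todo) > 0) {
    install.packages(todo, repos = repos)
  }
  invisible(todo)
}

# Example call (package names are assumed, adjust to the real dependencies):
# install_missing(c("rmarkdown", "dplyr", "lubridate"))
```

Checking first keeps the step cheap when the runner already has the packages cached.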

ajhoekst commented 1 year ago

Doh! I forgot that the repository doesn't have access to the data. There is no way to build the results without it.

Pausing for now... One solution could be to host the files somewhere behind credentials that are stored as a pipeline secret.
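A rough sketch of what that could look like on the R side, assuming the secret is exposed to the job as an environment variable. The variable name DATA_TOKEN, the bearer-token scheme, and the fetch_archive() helper are all assumptions for illustration; the real host may need a different kind of authentication.

```r
# Hypothetical helper: download a ZIP from a credential-protected URL and unpack it.
# The token is read from an environment variable that the pipeline would
# populate from a stored secret (the name DATA_TOKEN is an assumption).
fetch_archive <- function(url, dest, token = Sys.getenv("DATA_TOKEN")) {
  if (!nzchar(token)) {
    stop("DATA_TOKEN is not set; cannot download the data archive")
  }
  download.file(
    url,
    destfile = dest,
    mode = "wb",  # binary mode so the ZIP is not corrupted on Windows
    headers = c(Authorization = paste("Bearer", token))
  )
  # Extract next to the downloaded file; returns the extracted paths invisibly.
  unzip(dest, exdir = dirname(dest))
}
```

Failing early when the token is missing gives a clearer log message than a failed download further down.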

ajhoekst commented 1 year ago

@raevard @q-w-a I was able to make significant progress towards this goal. However, I'm running into an error on the server that I cannot replicate locally.

Any ideas why the ReturnTS object can't be found when calling as.Date()? Reference to the line

Full logs for the run on the server

Note: I proved to myself that the ZIP file was downloaded and extracted.

q-w-a commented 1 year ago

Not sure immediately, as I'm also not able to reproduce the error locally. I added two things to try. First, I added a glimpse(all_data) to check whether the extraction failed and left us with an empty data frame (which would produce an error like that). Second, I changed as.Date() to as_date() from lubridate, which in my experience has been more reliable. I really doubt the function is the issue, though; I suspect the data frame is empty and that's why we're getting the error.
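To illustrate the suspicion, here is a small base R sketch showing how an empty data frame leads to an "object not found" error, plus a guard that fails earlier with a clearer message. The column name ReturnTs is taken from this thread (its exact casing is uncertain), and check_loaded() is a hypothetical helper, not something in the repository.

```r
# An empty data frame has no columns, so the column name is looked up
# as an ordinary variable and cannot be found.
empty <- data.frame()
msg <- tryCatch(
  with(empty, as.Date(ReturnTs)),
  error = function(e) conditionMessage(e)
)
print(msg)

# Hypothetical guard: stop with an explicit message before any date parsing.
check_loaded <- function(df, required = "ReturnTs") {
  if (nrow(df) == 0) {
    stop("Data frame is empty; the extraction or file listing probably failed")
  }
  absent <- setdiff(required, names(df))
  if (length(absent) > 0) {
    stop("Missing expected column(s): ", paste(absent, collapse = ", "))
  }
  invisible(df)
}
```

Calling check_loaded(all_data) right after loading would separate a loading failure from a date-parsing failure in the logs.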

raevard commented 1 year ago

An "object can't be found" error typically originates from a loading issue where the object is loaded as empty, or in a form that we don't expect. My first guess would be that something goes wrong when the server executes lines 160-162, so that it doesn't properly parse ReturnTs or even more of the variables. Another guess is that it parses properly but doesn't follow the variable naming scheme we expect. Like Quinn just said, my third thought would be switching to as_date from lubridate.

raevard commented 1 year ago

Another thought I just had: there's a possibility it's an issue with the paths. You mention that the data need to be stored somewhere with credentials, but you were able to unzip and extract it. The file paths Quinn and I used were written for local Mac OS style paths. I wonder if there's something in the get_df function which assumes a local Mac OS file system (with \ slashes instead of /), which then eventually results in an empty dataset.

raevard commented 1 year ago

Follow-up on my previous comment: if you unzipped and extracted without using the get_df function, it would make sense that you were able to unzip it but unable to find the objects later on after using get_df.

ajhoekst commented 1 year ago

@q-w-a glimpse is a good idea! Adding it now to see if this helps me home in on the problem.

@raevard Great thought on the get_df extract. When I said extract, I meant unzipping as in extracting from the ZIP archive. I'm guessing the Mac vs. Linux paths won't be an issue since they use the same convention...and besides, this branch has a slight update for file path references to avoid specifying a slash direction.

Thank you!
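For anyone following along, a minimal sketch of what "avoid specifying a slash direction" can look like in R. The directory and file names here are made up for illustration and are not the ones used in the branch.

```r
# Build paths from components instead of hard-coding a separator.
# file.path() joins with "/" by default, which R accepts on Windows too.
data_dir <- "data"  # hypothetical directory name
xml_path <- file.path(data_dir, "archive", "example.xml")
print(xml_path)

# basename() and dirname() split a path back apart without string surgery.
print(basename(xml_path))
print(dirname(xml_path))
```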

q-w-a commented 1 year ago

Okay, so for the last commit I set error = TRUE and set it to render only the first file so we can see the messages and errors more clearly. It looks like the files vector is empty, which results in an empty data frame, so there must be something going on with the paths to the unzipped files.

q-w-a commented 1 year ago

referring to the part:

files <- dir(Sys.getenv("ARCHIVE_NAME"), full.names = TRUE, pattern = ".*\\.xml$")
print(files)

and the print statement showed the length of this files vector is zero
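One way to narrow this down is sketched below, on the assumption that ARCHIVE_NAME is either unset on the server, points at a directory that does not exist relative to the working directory, or that the ZIP unpacked into a nested folder. find_xml() is a hypothetical helper. Note that dir() returns an empty vector without any warning when the directory does not exist, which would match what we are seeing.

```r
# Hypothetical helper: locate the XML files and fail loudly with context
# instead of silently returning an empty vector.
find_xml <- function(archive_dir = Sys.getenv("ARCHIVE_NAME")) {
  if (!nzchar(archive_dir)) {
    stop("ARCHIVE_NAME is empty or unset")
  }
  if (!dir.exists(archive_dir)) {
    stop("Directory does not exist: ", archive_dir,
         " (working directory: ", getwd(), ")")
  }
  # recursive = TRUE also finds files that were extracted into a subfolder.
  files <- list.files(archive_dir, pattern = "\\.xml$",
                      full.names = TRUE, recursive = TRUE)
  if (length(files) == 0) {
    stop("No .xml files under ", archive_dir, "; contents: ",
         paste(list.files(archive_dir, recursive = TRUE), collapse = ", "))
  }
  files
}
```

If this finds files only because of recursive = TRUE, the archive probably contains a top-level folder and ARCHIVE_NAME needs to point one level deeper.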