imperialCHEPI / healthgps-plots

A set of scripts for plotting and visualising data related to Health-GPS
BSD 3-Clause "New" or "Revised" License

Plots: average by age, gender over multiple output CSVs #1

Closed: jamesturner246 closed this issue 8 months ago

jamesturner246 commented 9 months ago

The current R plotting script Visualization.R at https://github.com/imperialCHEPI/healthgps-plots takes in a fixed number of output CSV files and averages data over age and gender, weighted by group size, separated into baseline and intervention scenarios.

It would be useful to allow an arbitrary number of files to be fed in and averaged in this way. This would allow us to plot results from HPC array jobs, with multiple output files for each job array index.

As before, we should be careful to weight the average by count for each group, and ensure that baseline and intervention data points are separated correctly, since the ordering of baseline and intervention results is not guaranteed.
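As a rough illustration of the request above, here is a minimal base R sketch of a count-weighted average over any number of files, grouped by scenario, age and gender. It is not the code in Visualization.R: the function name `weighted_group_means` and the column names `source`, `age`, `gender`, `count` and `value` are hypothetical placeholders and would need to be mapped onto the real Health-GPS output columns.

```r
# Sketch only: the column names (source, age, gender, count, value) are
# hypothetical placeholders, not the confirmed Health-GPS output schema.
weighted_group_means <- function(files) {
  # Read every file and stack the rows; works for any number of files.
  data <- do.call(rbind, lapply(files, read.csv, stringsAsFactors = FALSE))

  # Group on the scenario label itself, so the order in which baseline and
  # intervention rows appear within or across files does not matter.
  groups <- split(data, list(data$source, data$age, data$gender), drop = TRUE)

  rows <- lapply(groups, function(g) {
    data.frame(
      source = g$source[1],
      age    = g$age[1],
      gender = g$gender[1],
      count  = sum(g$count),                          # total group size
      value  = weighted.mean(g$value, w = g$count)    # weighted by group size
    )
  })

  result <- do.call(rbind, rows)
  rownames(result) <- NULL
  result
}
```

Assuming the file paths arrive as command-line arguments, this could be called as `weighted_group_means(commandArgs(trailingOnly = TRUE))`. Splitting on the scenario column, rather than relying on row position, is what keeps baseline and intervention apart when their ordering is not guaranteed.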

jzhu20 commented 9 months ago

@TinyMarsh My updated R script, with commands to import multiple CSV files at once, can be found here: https://imperiallondon-my.sharepoint.com/:u:/r/personal/jjo11_ic_ac_uk/Documents/CHEPI/research/franco/Health-GPS_SHARED/Health-GPS_INDIA/Visualization/Visualization.R?csf=1&web=1&e=oNcjqB

TinyMarsh commented 9 months ago

Thanks @jzhu20. Would you like to have a go at creating a pull request so I can review and merge your changes? If you want, we can go through this together, perhaps tomorrow? Alternatively, if time is an issue, I can just do it myself; that's not a problem.

jzhu20 commented 9 months ago

@TinyMarsh I just created a pull request and added you as reviewer. I'm happy to go through it together. I can make 1-2pm tomorrow, otherwise next Monday from 11am onwards.

jamesturner246 commented 9 months ago

Could you give a link to the PR? I'll have a look through. Thanks!

jzhu20 commented 9 months ago

> Could you give a link to the PR? I'll have a look through. Thanks!

https://github.com/imperialCHEPI/healthgps-plots/pull/3 Thanks.

jzhu20 commented 9 months ago

@TinyMarsh @jamesturner246 I got this error message when I ran:

Rscript Visualization.R C:\Users\jzhu5\OneDrive - Imperial College London\Project\RSTL_India\Visualization\From 2023\HealthGPS_Result_S1.csv C:\Users\jzhu5\OneDrive - Imperial College London\Project\RSTL_India\Visualization\From 2023\HealthGPS_Result_S2.csv C:\Users\jzhu5\OneDrive - Imperial College London\Project\RSTL_India\Visualization\From 2023\HealthGPS_Result_S4.csv C:\Users\jzhu5\OneDrive - Imperial College London\Project\RSTL_India\Visualization\From 2023\HealthGPS_Result_S5.csv

collapse 2.0.9, see ?collapse-package or ?collapse-documentation

Attaching package: 'collapse'

The following object is masked from 'package:stats':

D

Error in file(file, "rt") : cannot open the connection
Calls: read.csv -> read.table -> file
In addition: Warning message:
In file(file, "rt") :
  cannot open file 'C:\Users\jzhu5\OneDrive': Permission denied
Execution halted

TinyMarsh commented 9 months ago

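@jzhu20 It looks like you need to wrap each file path in double quotes. Without them, the shell splits each path at its spaces, which is why R tried to open 'C:\Users\jzhu5\OneDrive' rather than the full path.

Try this: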

Rscript Visualization.R "C:\Users\jzhu5\OneDrive - Imperial College London\Project\RSTL_India\Visualization\From 2023\HealthGPS_Result_S1.csv" "C:\Users\jzhu5\OneDrive - Imperial College London\Project\RSTL_India\Visualization\From 2023\HealthGPS_Result_S2.csv" "C:\Users\jzhu5\OneDrive - Imperial College London\Project\RSTL_India\Visualization\From 2023\HealthGPS_Result_S4.csv" "C:\Users\jzhu5\OneDrive - Imperial College London\Project\RSTL_India\Visualization\From 2023\HealthGPS_Result_S5.csv"

OneDrive automatically creates spaces in its directory names, which is pretty horrible.
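To make this kind of mistake easier to spot, the script could check its arguments before reading anything. The following is only a sketch: `check_input_files` is a hypothetical helper, not something that exists in Visualization.R, and it relies only on base R (`file.exists`, `shQuote`, `stop`).

```r
# Hypothetical helper, not part of Visualization.R: fail early with a hint
# about quoting when an argument does not point at an existing file.
check_input_files <- function(paths) {
  missing <- paths[!file.exists(paths)]
  if (length(missing) > 0) {
    stop(
      "Cannot find input file(s): ",
      paste(shQuote(missing), collapse = ", "),
      ". If a path contains spaces, wrap it in double quotes.",
      call. = FALSE
    )
  }
  # Return the validated paths invisibly so the call can be chained.
  invisible(paths)
}
```

It might be called as `files <- check_input_files(commandArgs(trailingOnly = TRUE))` near the top of the script. With an unquoted path, the error message would then list the fragments the shell produced, which makes the splitting at spaces visible.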