Closed gdevenyi closed 7 years ago
To my knowledge this isn't implemented yet but perhaps @jasonlerch knows.
If it isn't implemented yet it is certainly do-able. I'm the label, jacobian, and sum cases you just need to add the rows together according to the hierarchy. In the mean case it's a bit trickier because you need to keep track of the number of voxels in each label to compute the correct average. This would require changes to the underlying c/c++ code.
For now, the question is restricted to the MAGeT label outputs.
I think this could be a simple function, I'll try to find time to write it next week.
I sketched a function to do this, definitely needs a little tlc, but you could try this if you don't want to wait for the next release. The function can either take a column name and a definitions file or a two
column data.frame with columns label
and group
.
library(dplyr); library(tidyr)
anatSummarize <-
function(anat
, summarize_by = "hierarchy"
, defs = getOption("RMINC_LABEL_DEFINITIONS")
, discard_missing = FALSE){
if(is.character(summarize_by) && length(summarize_by == 1)){
labels <- read.csv(defs, stringsAsFactors = FALSE)
summarize_by <-
labels %>%
select_("Structure", summarize_by) %>%
rename_("label" = "Structure", "group" = summarize_by)
}
if(!discard_missing)
summarize_by <-
summarize_by %>%
mutate_(group = ~ifelse(is.na(group) | group == "", label, group))
as.data.frame.matrix(anat) %>%
(tibble::rownames_to_column) %>%
gather_("label", "value", setdiff(colnames(anat), "rowname")) %>%
inner_join(summarize_by, by = "label") %>%
group_by_("group", "rowname") %>%
summarize_(value = ~ sum(value)) %>%
spread_("group", "value") %>%
as.matrix
}
Thanks! We'll give it a try.
So, something seems to go off the rails here, it works, then it doesn't:
anat <- anatGetAll(data$label, "/opt/quarantine/resources/DorrSteadmanUllmann/MAGeT/Dorr_2008_Steadman_2013_Ullmann_2013_on_NRXN1a_v1_labels.mnc", method="labels", defs="/opt/quarantine/resources/DorrSteadmanUllmann/Dorr_2008_Steadman_2013_Ullmann_2013_mapping_of_labels.csv")
data$volCombined <- anatSummarize(anat, summarize_by = "hierarchy", defs = "/opt/quarantine/resources/DorrSteadmanUllmann/Dorr_2008_Steadman_2013_Ullmann_2013_mapping_of_labels.csv")
We get the first "hierarchy" from the CSV, but after that things go weird in the outputs.
The outputs we got back as a summary:
volCombined.cerebellar vermis volCombined.cerebellar vermis white matter volCombined.cerebellar white matter volCombined.cerebral aqueduct volCombined.fourth ventricle volCombined.interpedunclar nucleus volCombined.medulla volCombined.midbrain volCombined.periaqueductal grey volCombined.pons volCombined.posterior commissure volCombined.third ventricle volCombined.ventral tegmental decussation
So the issue here is that not all structures are getting returned?
The outputs after the first cerebellar vermis label are from the "structure" column, not the "hierarchy" column.
Missed on copy and paste, first "combined" return is: volCombined.rowname
I'd never intended the results to be put as a column in a data frame, I intended something along the lines of
volCombined <- anatSummarize(...)
Which can be coerced to a data frame and and merged/joined to your original data frame if you wish. Rowname is the subject. The discard_missing
argument configures whether to keep or discard structures not found in the hierarchy, perhaps setting that to TRUE will fix the problem?
Results: no change vs. saving to a column in a dataframe.
discard_missing = TRUE throws error:
> volCombined <- anatSummarize(anat, summarize_by = "hierarchy", defs = "/opt/quarantine/resources/DorrSteadmanUllmann/Dorr_2008_Steadman_2013_Ullmann_2013_mapping_of_labels.csv", discard_missing = TRUE)
Error: All columns must be named
Called from: as_data_frame(data)
Hmm, just tried it on my machine and it work fine. Could you call traceback()
after the error? I'm not sure where as_data_frame(data) is getting called. Also if you could give me the output of sessionInfo()
it could be a dplyr/tidyr versioning thing.
volCombined <- anatSummarize(anat, summarize_by = "hierarchy", defs = "/opt/quarantine/resources/DorrSteadmanUllmann/Dorr_2008_Steadman_2013_Ullmann_2013_mapping_of_labels.csv", discard_missing = TRUE)
Error: All columns must be named
Called from: as_data_frame(data)
Browse[1]> traceback()
No traceback available
Browse[1]> sessionInfo()
R version 3.2.4 (2016-03-16)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS
locale:
[1] LC_CTYPE=en_CA.UTF-8 LC_NUMERIC=C LC_TIME=en_CA.UTF-8
[4] LC_COLLATE=en_CA.UTF-8 LC_MONETARY=en_CA.UTF-8 LC_MESSAGES=en_CA.UTF-8
[7] LC_PAPER=en_CA.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] tidyr_0.4.1 dplyr_0.4.3 ggplot2_2.1.0 RMINC_1.4.3.4
[5] BatchJobs_1.6 BBmisc_1.9 lmerTest_2.0-30 lme4_1.1-11
[9] Matrix_1.2-4 RevoUtilsMath_3.2.4
loaded via a namespace (and not attached):
[1] splines_3.2.4 lattice_0.20-33 colorspace_1.2-6 htmltools_0.3.5
[5] base64enc_0.1-3 survival_2.38-3 nloptr_1.0.4 foreign_0.8-66
[9] DBI_0.3.1 RColorBrewer_1.1-2 plyr_1.8.3 stringr_1.0.0
[13] munsell_0.4.3 gtable_0.2.0 latticeExtra_0.6-28 httpuv_1.3.3
[17] parallel_3.2.4 Rcpp_0.12.4 acepack_1.3-3.3 xtable_1.8-2
[21] backports_1.0.2 scales_0.4.0 checkmate_1.7.3 Hmisc_3.17-2
[25] mime_0.4 sendmailR_1.2-1 gridExtra_2.2.1 brew_1.0-6
[29] fail_1.3 digest_0.6.9 stringi_1.0-1 shiny_0.13.2
[33] grid_3.2.4 tools_3.2.4 magrittr_1.5 lazyeval_0.1.10
[37] RSQLite_1.0.0 tibble_1.0 Formula_1.2-1 cluster_2.0.3
[41] MASS_7.3-45 gridBase_0.4-7 assertthat_0.1 minqa_1.2.4
[45] R6_2.1.2 rpart_4.1-10 nnet_7.3-12 nlme_3.1-126
[49] compiler_3.2.4
Hmm weird there's no backtrace, given that you have the browse prompt you probably hopped into the debugger, maybe this ate the trace.
I'm running tidyr 0.6 and dplyr 0.5, so it is possible it is package version problem
Will check into that, thanks
Updated versions to latest for tidyr and dplyr, no errors anymore, however outputs aren't yet as expected.
Without discard:
"","rowname","cerebellar vermis","cerebellar vermis white matter","cerebellar white matter","cerebral aqueduct","fourth ventricle","interpedunclar nucleus","medulla","midbrain","periaqueductal grey","pons","posterior commissure","third ventricle","ventral tegmental decussation"
"1","1","13.265","1.862","3.755","0.455","0.711","0.196","26.331","11.979","4.788","16.888","0.238","1.563","0.139"
"2","10","13.810","1.943","3.706","0.471","0.649","0.197","26.336","11.494","4.881","17.203","0.232","1.501","0.122"
"3","11","14.193","2.065","3.975","0.391","0.625","0.193","27.574","12.052","4.882","17.589","0.203","1.176","0.165"
"4","12","16.265","2.357","3.855","0.419","0.732","0.246","28.528","12.595","4.661","17.922","0.259","1.683","0.107"
"5","13","15.425","2.457","4.316","0.376","0.672","0.236","30.114","14.284","4.857","18.662","0.208","1.769","0.131"
"6","14","15.303","2.350","4.472","0.392","0.704","0.248","29.364","13.661","4.635","19.202","0.241","1.548","0.131"
"7","15","15.633","2.220","3.749","0.431","0.685","0.203","29.056","12.019","4.614","17.710","0.195","1.505","0.146"
"8","16","15.072","2.206","3.913","0.411","0.693","0.162","29.419","12.779","4.728","18.020","0.178","1.299","0.185"
"9","17","16.080","2.285","3.983","0.476","0.728","0.227","30.049","12.341","4.649","17.990","0.256","1.560","0.127"
"10","18","16.060","2.188","4.164","0.494","0.731","0.235","31.930","13.071","4.990","19.077","0.213","1.616","0.142"
"11","19","16.020","2.277","4.360","0.472","0.726","0.252","32.301","12.825","4.659","19.540","0.245","1.559","0.130"
"12","2","14.547","2.097","3.916","0.450","0.694","0.202","29.066","12.604","4.862","18.010","0.203","1.537","0.135"
"13","20","14.782","2.105","3.945","0.402","0.721","0.207","27.978","12.151","4.724","18.244","0.230","1.603","0.120"
"14","21","14.848","2.208","4.425","0.452","0.737","0.230","30.133","13.264","4.803","19.747","0.253","1.697","0.137"
"15","22","14.240","2.350","4.822","0.389","0.713","0.248","30.322","13.272","4.668","19.484","0.255","1.843","0.121"
"16","23","14.492","2.065","4.025","0.504","0.745","0.226","28.033","12.412","4.760","17.718","0.225","1.577","0.111"
"17","24","14.966","2.223","4.520","0.469","0.733","0.254","30.095","13.523","5.030","19.951","0.221","1.725","0.142"
"18","25","14.454","2.282","4.670","0.472","0.776","0.231","30.048","13.626","4.631","19.350","0.241","1.766","0.140"
"19","26","14.854","2.129","3.914","0.440","0.742","0.228","27.686","12.199","4.623","18.487","0.209","2.199","0.138"
"20","27","15.295","2.250","4.390","0.396","0.791","0.243","30.314","13.346","4.732","20.253","0.275","2.451","0.152"
"21","28","14.766","2.229","4.472","0.392","0.763","0.249","30.135","13.506","4.657","20.467","0.255","2.453","0.134"
"22","3","14.297","2.119","3.985","0.433","0.677","0.216","29.653","13.280","4.527","18.207","0.216","1.557","0.124"
"23","4","13.766","1.980","3.608","0.503","0.690","0.192","26.566","11.736","4.719","16.583","0.232","1.445","0.144"
"24","5","14.197","2.057","3.750","0.481","0.666","0.201","27.415","11.677","4.571","16.804","0.215","1.411","0.143"
"25","6","13.960","1.966","4.119","0.487","0.715","0.224","29.656","12.907","4.730","18.231","0.232","1.470","0.140"
"26","7","13.221","1.930","3.621","0.488","0.698","0.172","25.442","11.265","4.338","16.167","0.198","1.233","0.115"
"27","8","13.383","1.887","3.710","0.456","0.696","0.211","26.598","11.567","4.399","17.177","0.157","1.229","0.128"
"28","9","13.648","2.039","3.998","0.419","0.713","0.215","27.518","12.023","4.233","17.691","0.183","1.232","0.122"
And with:
"","rowname","","cerebellar vermis","cerebellar vermis white matter","cerebellar white matter"
"1","1","63.288","13.265","1.862","3.755"
"2","10","63.086","13.810","1.943","3.706"
"3","11","64.850","14.193","2.065","3.975"
"4","12","67.152","16.265","2.357","3.855"
"5","13","71.309","15.425","2.457","4.316"
"6","14","70.126","15.303","2.350","4.472"
"7","15","66.564","15.633","2.220","3.749"
"8","16","67.874","15.072","2.206","3.913"
"9","17","68.403","16.080","2.285","3.983"
"10","18","72.499","16.060","2.188","4.164"
"11","19","72.709","16.020","2.277","4.360"
"12","2","67.763","14.547","2.097","3.916"
"13","20","66.380","14.782","2.105","3.945"
"14","21","71.453","14.848","2.208","4.425"
"15","22","71.315","14.240","2.350","4.822"
"16","23","66.311","14.492","2.065","4.025"
"17","24","72.143","14.966","2.223","4.520"
"18","25","71.281","14.454","2.282","4.670"
"19","26","66.951","14.854","2.129","3.914"
"20","27","72.953","15.295","2.250","4.390"
"21","28","73.011","14.766","2.229","4.472"
"22","3","68.890","14.297","2.119","3.985"
"23","4","62.810","13.766","1.980","3.608"
"24","5","63.584","14.197","2.057","3.750"
"25","6","68.792","13.960","1.966","4.119"
"26","7","60.116","13.221","1.930","3.621"
"27","8","62.618","13.383","1.887","3.710"
"28","9","64.349","13.648","2.039","3.998"
Can you send me an rds file containing your anat? I can debug faster if I have the data.
saveRDS(anat, "anatSummarize_test.rds")
Ok I think it's fixed this time, it wasn't handling the laterality of the structures. I'll push a commit to develop exposing anatSummarize this afternoon if you want to give it a try.
can do!
Ready to go.
Pushed off to testers!
Thanks Gabe, much appreciated.
Still having trouble, different error now:
volCombined <- anatSummarize(anat, summarize_by = "hierarchy", defs = "/opt/quarantine/resources/DorrSteadmanUllmann/Dorr_2008_Steadman_2013_Ullmann_2013_mapping_of_labels.csv")
Error: invalid 'type' (character) of argument
In addition: Warning message:
attributes are not identical across measure variables; they will be dropped
> traceback()
12: stop(list(message = "invalid 'type' (character) of argument",
call = NULL, cppstack = NULL))
11: summarise_impl(.data, dots)
10: summarise_.tbl_df(., value = ~sum(value))
9: summarize_(., value = ~sum(value))
8: function_list[[i]](value)
7: freduce(value, `_function_list`)
6: `_fseq`(`_lhs`)
5: eval(expr, envir, enclos)
4: eval(quote(`_fseq`(`_lhs`)), env, env)
3: withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
2: anat %>% as.data.frame.matrix %>% (tibble::rownames_to_column) %>%
gather_("label", "value", setdiff(colnames(anat), "rowname")) %>%
inner_join(summarize_by, by = "label") %>% group_by_("group",
"rowname") %>% summarize_(value = ~sum(value)) %>% spread_("group",
"value") %>% arrange_(~as.numeric(rowname)) %>% select_(~-rowname) %>%
as.matrix
1: anatSummarize(anat, summarize_by = "hierarchy", defs = "/opt/quarantine/resources/DorrSteadmanUllmann/Dorr_2008_Steadman_2013_Ullmann_2013_mapping_of_labels.csv")
Could you send me your copy of the DSU labels, you can attach it to the issue, it's possible our csv's are a little different and that's causing troubles.
Attached:
Num,Structure,right label,left label,hierarchy,tissue type
1,amygdala,51,151,,GM
2,anterior commissure: pars anterior,115,215,,WM
3,anterior commissure: pars posterior,23,103,,WM
4,basal forebrain,52,152,,GM
5,bed nucleus of stria terminalis,176,76,,GM
6,cerebellar peduncle: inferior,123,223,,WM
7,cerebellar peduncle: middle,45,245,,WM
8,cerebellar peduncle: superior,242,222,,WM
9,cerebral aqueduct,119,119,,CSF
10,cerebral peduncle,114,14,,WM
11,colliculus: inferior,143,43,,GM
12,colliculus: superior,9,109,,GM
13,corpus callosum,8,68,,WM
14,corticospinal tract/pyramids,218,18,,WM
15,cuneate nucleus,166,168,,GM
16,dentate gyrus of hippocampus,16,66,,GM
17,facial nerve (cranial nerve 7),19,219,,GM
18,fasciculus retroflexus,25,125,,WM
19,fimbria,211,11,,
20,fornix,122,22,,WM
21,fourth ventricle,118,118,,CSF
22,fundus of striatum,54,154,,GM
23,globus pallidus,44,144,,GM
24,habenular commissure,99,199,,WM
25,hippocampus,6,106,,GM
26,hypothalamus,250,150,,GM
27,inferior olivary complex,113,213,,GM
28,internal capsule,112,12,,WM
29,interpedunclar nucleus,157,157,,GM
30,lateral olfactory tract,101,102,,WM
31,lateral septum,207,107,,
32,lateral ventricle,57,77,,CSF
33,mammillary bodies,161,61,,GM
34,mammilothalamic tract,210,212,,WM
35,medial lemniscus/medial longitudinal fasciculus,20,120,,WM
36,medial septum,53,153,,
37,medulla,174,174,,
38,midbrain,194,194,,
39,nucleus accumbens,55,155,,GM
40,olfactory bulbs,5,105,,GM
41,olfactory tubercle,95,145,,GM
42,optic tract,216,116,,WM
43,periaqueductal grey,10,10,,GM
44,pons,187,187,,GM
45,pontine nucleus,85,185,,GM
46,posterior commissure,100,100,,WM
47,pre-para subiculum,133,131,,GM
48,stratum granulosum of hippocampus,13,63,,GM
49,stria medullaris,225,205,,GM
50,stria terminalis,59,159,,GM
51,striatum,7,17,,GM
52,subependymale zone / rhinocele,240,140,,GM
53,superior olivary complex,124,214,,GM
54,thalamus,204,4,,GM
55,third ventricle,146,146,,CSF
56,ventral tegmental decussation,156,156,,WM
57,lobules 1-2: lingula and central lobule (ventral),32,32,cerebellar vermis,GM
58,lobule 3: central lobule (dorsal),233,233,cerebellar vermis,GM
59,lobules 4-5: culmen (ventral and dorsal),34,34,cerebellar vermis,GM
60,lobule 6: declive,36,36,cerebellar vermis,GM
61,lobule 7: tuber (or folium),237,237,cerebellar vermis,GM
62,lobule 8: pyramis,38,38,cerebellar vermis,GM
63,lobule 9: uvula,239,239,cerebellar vermis,GM
64,lobule 10: nodulus,40,40,cerebellar vermis,GM
65,anterior lobule (lobules 4-5),90,148,cerebellar paravermis,GM
66,simple lobule (lobule 6),191,91,cerebellar hemisphere,GM
67,crus 1: ansiform lobule (lobule 6),92,192,cerebellar hemisphere,GM
68,crus 2: ansiform lobule (lobule 7),193,93,cerebellar hemisphere,GM
69,paramedian lobule (lobule 7),94,200,cerebellar hemisphere,GM
70,copula: pyramis (lobule 8),196,96,cerebellar hemisphere,GM
71,flocculus (FL),97,197,cerebellar flocculonodular lobe,GM
72,paraflocculus (PFL),198,98,cerebellar flocculonodular lobe,GM
73,trunk of arbor vita,47,47,cerebellar white matter,WM
74,lobule 1-2 white matter,232,232,cerebellar vermis white matter,WM
75,lobule 3 white matter,33,33,cerebellar vermis white matter,WM
76,trunk of lobules 1-3 white matter,253,253,cerebellar vermis white matter,WM
77,lobules 4-5 white matter,234,234,cerebellar vermis white matter,WM
78,lobules 6-7 white matter,236,236,cerebellar vermis white matter,WM
79,lobule 8 white matter,238,238,cerebellar vermis white matter,WM
80,trunk of lobules 6-8 white matter,254,254,cerebellar vermis white matter,WM
81,lobule 9 white matter,139,139,cerebellar vermis white matter,WM
82,lobule 10 white matter,252,252,cerebellar vermis white matter,WM
83,anterior lobule white matter,21,31,cerebellar paravermis white matter,
84,simple lobule white matter,241,251,cerebellar white matter,WM
85,crus 1 white matter,220,170,cerebellar white matter,WM
86,trunk of simple and crus 1 white matter,226,246,cerebellar white matter,WM
87,crus 2 white matter,229,249,cerebellar white matter,WM
88,paramedian lobule,228,248,cerebellar white matter,WM
89,trunk of crus 2 and paramedian white matter,175,195,cerebellar white matter,WM
90,copula white matter,224,244,cerebellar white matter,WM
91,paraflocculus white matter,183,243,cerebellar flocculonodular lobe white matter,WM
92,flocculus white matter,167,177,cerebellar flocculonodular lobe white matter,WM
93,dentate nucleus,1,201,deep cerebellar nuclei,GM
94,nucleus interpositus,203,3,deep cerebellar nuclei,GM
95,fastigial nucleus,15,206,deep cerebellar nuclei,GM
96,Cingulate cortex: area 24a,24,169,Cingulate region,GM
97,Cingulate cortex: area 24a',26,171,Cingulate region,GM
98,Cingulate cortex: area 24b,27,172,Cingulate region,GM
99,Cingulate cortex: area 24b',28,173,Cingulate region,GM
100,Cingulate cortex: area 25,29,178,Cingulate region,GM
101,Cingulate cortex: area 29a,30,179,Cingulate region,GM
102,Cingulate cortex: area 29b,35,182,Cingulate region,GM
103,Cingulate cortex: area 29c,37,184,Cingulate region,GM
104,Cingulate cortex: area 30,39,186,Cingulate region,GM
105,Cingulate cortex: area 32,41,188,Cingulate region,GM
106,Amygdalopiriform transition area,42,189,Piriform cortex area,GM
107,Primary auditory cortex,46,208,Temporal region,GM
108,Secondary auditory cortex: dorsal area,48,217,Temporal region,GM
109,Secondary auditory cortex: ventral area,49,221,Temporal region,GM
110,Caudomedial entorhinal cortex,50,227,Entorhinal cortex,GM
111,Cingulum ,56,231,Fiber Groups,WM
112,Claustrum,58,235,Insular claustrum,GM
113,Cortex-amygdala transition zones,60,255,Piriform cortex area,GM
114,Claustrum: dorsal part,62,256,Insular claustrum,GM
115,Dorsal nucleus of the endopiriform,65,257,Endopiriform claustrum,GM
116,Dorsal intermediate entorhinal cortex,67,258,Entorhinal cortex,GM
117,Dorsolateral entorhinal cortex,69,259,Entorhinal cortex,GM
118,Dorsolateral orbital cortex,70,260,Frontal region,GM
119,Dorsal tenia tecta,71,261,Entorhinal cortex,GM
120,Ectorhinal cortex,72,262,Insular region,GM
121,Frontal cortex: area 3,73,263,Frontal region,GM
122,Frontal association cortex,74,264,Frontal region,GM
123,Intermediate nucleus of the endopiriform claustrum,75,265,Endopiriform claustrum,GM
124,Insular region: not subdivided,78,266,Insular region,GM
125,Lateral orbital cortex,79,267,Frontal region,GM
126,Lateral parietal association cortex,80,268,Parietal region,GM
127,Primary motor cortex,81,269,Frontal region,GM
128,Secondary motor cortex,82,270,Frontal region,GM
129,Medial entorhinal cortex,83,271,Entorhinal cortex,GM
130,Medial orbital cortex,84,272,Frontal region,GM
131,Medial parietal association cortex,86,273,Parietal region,GM
132,Piriform cortex,87,274,Piriform cortex area,GM
133,Posterolateral cortical amygdaloid area ,88,275,Piriform cortex area,GM
134,Posteromedial cortical amygdaloid area ,89,276,Piriform cortex area,GM
135,Perirhinal cortex,104,277,Insular region,GM
136,Parietal cortex: posterior area: rostral part,108,278,Parietal region,GM
137,Rostral amygdalopiriform area,110,279,Piriform cortex area,GM
138,Primary somatosensory cortex,111,280,Parietal region,GM
139,Primary somatosensory cortex: barrel field,117,281,Parietal region,GM
140,Primary somatosensory cortex: dysgranular zone,121,282,Parietal region,GM
141,Primary somatosensory cortex: forelimb region,126,283,Parietal region,GM
142,Primary somatosensory cortex: hindlimb region,127,284,Parietal region,GM
143,Primary somatosensory cortex: jaw region,128,285,Parietal region,GM
144,Primary somatosensory cortex: shoulder region,129,286,Parietal region,GM
145,Primary somatosensory cortex: trunk region,132,287,Parietal region,GM
146,Primary somatosensory cortex: upper lip region,134,288,Parietal region,GM
147,Secondary somatosensory cortex,135,289,Parietal region,GM
148,Temporal association area,136,290,Temporal region,GM
149,Primary visual cortex,137,291,Occipital region,GM
150,Primary visual cortex: binocular area,138,292,Occipital region,GM
151,Primary visual cortex: monocular area,141,293,Occipital region,GM
152,Secondary visual cortex: lateral area,142,294,Occipital region,GM
153,Secondary visual cortex: mediolateral area,147,295,Occipital region,GM
154,Secondary visual cortex: mediomedial area,149,296,Occipital region,GM
155,Claustrum: ventral part,158,297,Insular claustrum,GM
156,Ventral nucleus of the endopiriform claustrum,160,298,Endopiriform claustrum,GM
157,Ventral intermediate entorhinal cortex,162,299,Entorhinal cortex,GM
158,Ventral orbital cortex,163,300,Frontal region,GM
159,Ventral tenia tecta,165,301,Entorhinal cortex,GM
Hmm, with your label set it seems to be working both on our ubuntu 12.04, and centos 5 installs. Both are using R 3.3.2 where it appears you are using R 3.2.4. Did you end up upgrading dplyr and tidyr?
I wonder if something has changed with ~
quotation since 3.2.4, it's complaining that sum has received a character argument, but normal non-standard evaluation should interpolate value
to a numeric vector generated by gather_
.
To test that this is indeed the problem I made a branch gabe_nse_fix20170512 that uses a different summarization technique. If that solves your problem I'm not precisely sure how to proceed, this version is slower and less clear, but has a longer compatibility window maybe...
I did upgrade dplyr and tidyr.
I'm happy to upgrade our R install to get this working, but I will test with the fix branch as well. Thanks!
Still having issues... git-develop with updated R
volCombined <- anatSummarize(anat, summarize_by = "hierarchy", defs = "/opt/quarantine/resources/DorrSteadmanUllmann/Dorr_2008_Steadman_2013_Ullmann_2013_mapping_of_labels.csv")
Error in summarise_impl(.data, dots) :
invalid 'type' (character) of argument
In addition: Warning message:
attributes are not identical across measure variables; they will be dropped
> sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_CA.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_CA.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_CA.UTF-8
[8] LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] RMINC_1.4.3.4 BatchJobs_1.6 BBmisc_1.11 RevoUtilsMath_10.0.0
loaded via a namespace (and not attached):
[1] Rcpp_0.12.9 magrittr_1.5 splines_3.3.3 MASS_7.3-45 gridBase_0.4-7 xtable_1.8-2 lattice_0.20-34 R6_2.2.0 brew_1.0-6 minqa_1.2.4 sendmailR_1.2-1
[12] stringr_1.2.0 dplyr_0.5.0 tools_3.3.3 fail_1.3 parallel_3.3.3 grid_3.3.3 checkmate_1.8.2 nlme_3.1-131 DBI_0.6 htmltools_0.3.5 lazyeval_0.2.0
[23] yaml_2.1.14 lme4_1.1-12 digest_0.6.12 assertthat_0.1 tibble_1.2 Matrix_1.2-8 shiny_1.0.0 tidyr_0.6.1 nloptr_1.0.4 base64enc_0.1-3 mime_0.5
[34] memoise_1.0.0 RSQLite_1.1-2 stringi_1.1.2 RevoUtils_10.0.3 backports_1.0.5 httpuv_1.3.3
Did you try the branch I made for you?
Also can you try running the offending code:
summarize_by <- "hierarchy"
defs <- "/opt/quarantine/resources/DorrSteadmanUllmann/Dorr_2008_Steadman_2013_Ullmann_2013_mapping_of_labels.csv"
summarize_by <-
RMINC:::create_labels_frame(defs) %>%
select_(~ -label) %>%
rename_(label = "Structure", group = summarize_by)
summarize_by <-
summarize_by %>%
mutate_(group = ~ifelse(is.na(group) | group == "", label, group))
partial <-
anat %>%
as.data.frame.matrix %>%
(tibble::rownames_to_column) %>%
gather_("label", "value", setdiff(colnames(anat), "rowname")) %>%
inner_join(summarize_by, by = "label") %>%
group_by_("group", "rowname")
## This is the standard evaluation version, should fail
partial %>%
summarize_(value = ~ sum(value))
## This is the non-standard evaluation version, might work
partial %>%
summarize(value = sum(value))
## Check that value is numeric
glimpse(partial)
No not yet, this is with updated R (see sessioninfo)
From the fixes branch:
volCombined <- anatSummarize(anat, summarize_by = "hierarchy", defs = "/opt/quarantine/resources/DorrSteadmanUllmann/Dorr_2008_Steadman_2013_Ullmann_2013_mapping_of_labels.csv")
Error in sum(.$value) : invalid 'type' (character) of argument
In addition: Warning message:
attributes are not identical across measure variables; they will be dropped
Weird thing I just noticed, anat
looks like it has characters instead of numbers:
head(anat)
right dentate nucleus left nucleus interpositus left thalamus right olfactory bulbs right hippocampus right striatum right corpus callosum right colliculus: superior periaqueductal grey
[1,] " 0.075" " 0.113" " 9.013" "13.505" " 9.344" "10.503" " 8.029" " 4.398" " 4.788"
[2,] " 0.079" " 0.122" " 9.661" "14.289" " 9.559" "10.827" " 8.493" " 4.571" " 4.862"
[3,] " 0.094" " 0.119" " 9.404" "14.651" " 9.623" "10.767" " 7.971" " 4.395" " 4.527"
Notice the spaces inside the number boxes. Maybe this is the bug origin?
rstudio says so too, see upper right
On my side - reading the rds you sent me anat
is a numeric matrix, actually mine is an anatUnilateral
which under the hood is a numeric matrix. Yours also has an extra row relative to the one you sent me. Did you happen to write it out and read it in from a csv? And yes, this is definitely why it is not working, it can't sum character vectors.
Indeed that file is.
This is with updated R and rebuilt RMINC based on develop. Clearly something changed :-/
anatGetAll_old works properly.
Looks like a bug in anatGetAll method=labels?
Does it still fail if you do not write to csv first? By which I mean write out - read back in, because that will likely create weirdness.
This produces a character array:
data = read.csv("/data/chamal/projects/colleen/FAT/MAGeT/cohort1_R.csv")
data$mouseID = as.factor(data$mouseID)
data$sex = as.factor(data$sex)
data$sex = relevel(data$sex, ref = "M")
data$timePoint = as.factor(data$timePoint)
data$timePoint = relevel(data$timePoint, ref = "1")
data$group = as.factor(data$group)
data$group = relevel(data$group, ref = "3tgCD")
## pulls all the volumes for all the labels, which are defined (which number means what) by a csv that lives elsewhere
anat <- anatGetAll(data$label, "/opt/quarantine/resources/DorrSteadmanUllmann/MAGeT/Dorr_2008_Steadman_2013_Ullmann_2013_on_NRXN1a_v1_labels.mnc", method="labels", defs="/opt/quarantine/resources/DorrSteadmanUllmann/Dorr_2008_Steadman_2013_Ullmann_2013_mapping_of_labels.csv")
This does not
data = read.csv("/data/chamal/projects/colleen/FAT/MAGeT/cohort1_R.csv")
data$mouseID = as.factor(data$mouseID)
data$sex = as.factor(data$sex)
data$sex = relevel(data$sex, ref = "M")
data$timePoint = as.factor(data$timePoint)
data$timePoint = relevel(data$timePoint, ref = "1")
data$group = as.factor(data$group)
data$group = relevel(data$group, ref = "3tgCD")
## pulls all the volumes for all the labels, which are defined (which number means what) by a csv that lives elsewhere
anat <- anatGetAll_old(data$label, "/opt/quarantine/resources/DorrSteadmanUllmann/MAGeT/Dorr_2008_Steadman_2013_Ullmann_2013_on_NRXN1a_v1_labels.mnc", method="labels", defs="/opt/quarantine/resources/DorrSteadmanUllmann/Dorr_2008_Steadman_2013_Ullmann_2013_mapping_of_labels.csv")
We're not writing to CSV anywhere.
On either anat: any(is.na(anat))
?
FALSE and FALSE
hmmm, I have no idea what's going on, on mock data and all of our tests anatGetAll method = "label" faithfully returns numeric. It's likely the conversion is happening in RMINC:::create_anat_results, but I can't trace it.
defs <- "/opt/quarantine/resources/DorrSteadmanUllmann/MAGeT/Dorr_2008_Steadman_2013_Ullmann_2013_on_NRXN1a_v1_labels.mnc"
## This should be numeric
cpp_res <- RMINC:::label_counts(data$label, defs = defs, method="labels", defs=defs, strict = TRUE)
label_frame <- RMINC:::create_labels_frame(defs, side = "both")
## Problem should be here...
res <- RMINC:::create_anat_results(cpp_res, label_frame, filenames = data$label, atlas = NULL)
Can you confirm that that is actually where the issue is occuring, if it's happening in the c++ code something even weirder is happening
res indeed contains chr data with spaces instead of numeric.
One fix, I had to do
label_frame <- RMINC:::create_labels_frame("/opt/quarantine/resources/DorrSteadmanUllmann/Dorr_2008_Steadman_2013_Ullmann_2013_mapping_of_labels.csv", side = "both")
Otherwise error:
label_frame <- RMINC:::create_labels_frame(defs, side = "both")
Error in make.names(col.names, unique = TRUE) :
invalid multibyte string at '<89>HDF'
In addition: Warning messages:
1: In read.table(file = file, header = header, sep = sep, quote = quote, :
line 3 appears to contain embedded nulls
2: In read.table(file = file, header = header, sep = sep, quote = quote, :
line 4 appears to contain embedded nulls
3: In read.table(file = file, header = header, sep = sep, quote = quote, :
line 5 appears to contain embedded nulls
More info: This is with minc-toolkit/1.9.15
There were some typos in the above code - defs should have been the text not the minc file. Also for the new anatGetAll an atlas doesn't need to be passed. I'm still not sure where coercion to character is happening...
Can you send me an rds of cpp_res? Sorry this is taking so long to debug.
anatGetAll_old throws the following warnings, while anatGetAll is silent:
[1] "WARNING: Labels found in the mapping, but not in the file: 199"
[1] "WARNING: Labels found in the mapping, but not in the file: 199"
[1] "WARNING: Labels found in the mapping, but not in the file: 199"
[1] "WARNING: Labels found in the mapping, but not in the file: 199"
[1] "WARNING: Labels found in the mapping, but not in the file: 199"
[1] "WARNING: Labels found in the mapping, but not in the file: 199"
Perhaps this may be an issue.
Also, last subject of anatGetAll has text instead of numbers at all, see data viewer last line
Thanks Gabe, that's definitely the issue, not sure how it got there, but that's at least a step in the right direction
Can you send that rds?
Sent!
Sorry Gabe, can you send cpp_res from above?
Nevermind, found the problem. I changed create_label_frame
to make it useful in anatSummarize
but this broke anatGetAll
when there's a hierarchy column and it wasn't picked up by our test cases. Fixing now.
Should be fixed in develop...?
Sadly no, grabbed/rebuilt HEAD of develop, still broken with characters.
I don't actually see a commit that addresses this? Did you forget to push?
With the more recent DSU and DSUR atlases there are many regions segmented which aren't really distinguishable (or interesting) on a given in-vivio scan.
The atlas CSV files define a hierarchy which combine other labels, but I can't find a place for RMINC to use it.
Is this feature available, or is there a script available to take advantage of this? Thanks!