ciernialab / MicrogliaMorphologyR

R package for microglia morphology analysis. Complementary to ImageJ macro, MicrogliaMorphology

issue with merging data #1

Closed AlexJTYang closed 6 months ago

AlexJTYang commented 7 months ago

Hey I've been trying to merge both the fraclac and skeleton data into one final cell-level data frame but have been having issues.

I am able to generate the cleaned data for both the fraclac and skeleton data frames, but whenever I attempt to combine them using data <- merge_data(fraclac, skeleton), the data set is generated with all 30 variables but 0 obs, and I keep getting a "no data available in the table" message in the data frame.
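A merge that returns every column but 0 rows usually means the two tables share no values in the column(s) they are joined on. A minimal base-R sketch for checking that, using toy data frames: the key column `Name` and all values are hypothetical stand-ins, not the package's confirmed join keys or the poster's data.

```r
# Toy stand-ins for the tidied tables; `Name` and the values are hypothetical.
fraclac  <- data.frame(Name = c("img1_cell1", "img1_cell2"),
                       Area = c(120, 95))
skeleton <- data.frame(Name = c("img1.cell1", "img1.cell2"),
                       Branches = c(14, 9))

# Keys present in both tables; an empty result would explain a 0-row merge.
shared <- intersect(fraclac$Name, skeleton$Name)
print(length(shared))

# Keys that exist on one side only, printed side by side for comparison.
only_fraclac  <- setdiff(fraclac$Name, skeleton$Name)
only_skeleton <- setdiff(skeleton$Name, fraclac$Name)
print(only_fraclac)
print(only_skeleton)
```

Printing the unmatched keys from each side next to each other tends to make small differences (punctuation, spacing, extensions) easy to spot.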

I have been following your readme file as well as the youtube tutorial and still cannot seem to get around this issue

thank you so much.

qpc444 commented 7 months ago

Hey Alex,

I had a similar problem. Mine stemmed from a file naming error. Check that the images that reach the "ThresholdedImages" folder have ".tif_thresholded" added to the end of the original file name. If any images are missing this extension it can cause the issue.
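One way to run that check without opening the folder by hand is sketched below in base R. The folder is created in a temporary directory and the file names are hypothetical; in practice `thresholded_dir` would point at the real "ThresholdedImages" folder.

```r
# Hypothetical folder and file names, created in a temp dir so the sketch runs anywhere.
thresholded_dir <- file.path(tempdir(), "ThresholdedImages")
dir.create(thresholded_dir, showWarnings = FALSE)
invisible(file.create(file.path(
  thresholded_dir,
  c("Cold_F_s01.tif_thresholded.tif", "Cold_F_s02.tif")
)))

files <- list.files(thresholded_dir)

# fixed = TRUE matches the text literally instead of treating "." as a regex wildcard.
missing_suffix <- files[!grepl(".tif_thresholded", files, fixed = TRUE)]
print(missing_suffix)
```

Any name printed at the end lacks the ".tif_thresholded" fragment and would be worth a closer look.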

As a note, I was working with an earlier version and I know Jenn has changed some things since, so you may have a separate problem if the file names look okay.

Quincy

AlexJTYang commented 6 months ago

Hey Quincy, thanks for getting back to me.

sadly everything seems to be labeled correctly so I am probably having a different issue than what you had.

I've attached my inputs showing exactly how I am attempting to merge everything, just in case that helps, but I don't think there is anything wrong.

I am still very new to R so I am not sure how to troubleshoot this on my own so any suggestions are extremely helpful!

Alex

devtools::install_github('ciernialab/MicrogliaMorphologyR')
Skipping install of 'MicrogliaMorphologyR' from a github remote, the SHA1 (8d81fbc0) has not changed since last install.
Use force = TRUE to force installation
library(MicrogliaMorphologyR)
Loading required package: tidyverse
── Attaching core tidyverse packages ──────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.0
✔ purrr     1.0.2
── Conflicts ──────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package to force all conflicts to become errors
Loading required package: Hmisc

Attaching package: ‘Hmisc’

The following objects are masked from ‘package:dplyr’:

src, summarize

The following objects are masked from ‘package:base’:

format.pval, units

Loading required package: pheatmap
Loading required package: factoextra
Welcome! Want to learn more? See two factoextra-related books at https://goo.gl/ve3WBa
Loading required package: lmerTest
Loading required package: lme4
Loading required package: Matrix

Attaching package: ‘Matrix’

The following objects are masked from ‘package:tidyr’:

expand, pack, unpack

Attaching package: ‘lmerTest’

The following object is masked from ‘package:lme4’:

lmer

The following object is masked from ‘package:stats’:

step

Loading required package: nlme

Attaching package: ‘nlme’

The following object is masked from ‘package:lme4’:

lmList

The following object is masked from ‘package:dplyr’:

collapse

Loading required package: SciViews
Loading required package: ggpubr
Loading required package: glmmTMB
Loading required package: DHARMa
This is DHARMa 0.4.6. For overview type '?DHARMa'. For recent changes, type news(package = 'DHARMa')
Loading required package: rstatix

Attaching package: ‘rstatix’

The following object is masked from ‘package:stats’:

filter

Loading required package: gridExtra

Attaching package: ‘gridExtra’

The following object is masked from ‘package:dplyr’:

combine

library(factoextra)
library(ppclust)
set.seed(1)

# specify directories
fraclac.dir <- "C:/Users/alxer/Documents/BROCKUNIVERSITY/SCHOOLWORK/phd/thesis work/Study 2/immunofluorescence/imageJ microglial morphology/FracLac/20231218112008"
skeleton.dir <- "C:/Users/alxer/Documents/BROCKUNIVERSITY/SCHOOLWORK/phd/thesis work/Study 2/immunofluorescence/imageJ microglial morphology/SkeletonResults"

# clean up fraclac and skeleton datasets separately
fraclac <- fraclac_tidying(fraclac.dir)
skeleton <- skeleton_tidying(skeleton.dir)

# merge both datasets together
data <- merge_data(fraclac, skeleton)
View(data)

qpc444 commented 6 months ago

Interesting. This still sounds like something is going on with the file names to me. Can you send a screenshot of the excel file names for the individual cells (the input folder for skeleton.dir)?

Quincy

AlexJTYang commented 6 months ago

here it is. I am working with a pretty large data set so this is just the first few, but they are all labelled the same way

Alex

image

qpc444 commented 6 months ago

I think the issue could be ".tif (green).tif_thresholded" in the file names. My files, for reference, are Cold_F_MR284_s01.tif_0001-0628.tifresults. Since the program is reading everything at the beginning separated by an underscore and everything after the ".tif####", the additional fields in yours might be throwing them off.

You can copy the first 2 files to a new folder and remove ".tif (green).tif_thresholded" before running the R program again with the new files as the input. You will also need to copy your 3 FracLac files to a new folder and remove everything below your top 2 samples to have a new FracLac input that matches. In the FracLac files you will probably need to edit your sample names to match the excel file names. For reference, my sample name in FracLac for the file from earlier is "Cold_F_MR284_s01tif0001-0628tifŞ1 (0,0_44x43)". If your samples have the "tif(green)thresholded" extension, try removing it to match up with mine.

Then try the R program with these new skeleton and fraclac inputs and see if they merge properly. If they do, I can help with fixing all the original files. If not, Jenn has to handle your issue haha.
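The copy-and-rename step for the skeleton files could be scripted roughly as below. This is only a sketch: the folders live in a temporary directory and the file names are hypothetical, modelled on the pattern quoted above; the fragment being removed is the one named in this comment. The originals are left untouched.

```r
# Hypothetical folders and file names; real paths would replace tempdir().
src_dir <- file.path(tempdir(), "SkeletonResults")
new_dir <- file.path(tempdir(), "SkeletonResults_test")
dir.create(src_dir, showWarnings = FALSE)
dir.create(new_dir, showWarnings = FALSE)
invisible(file.create(file.path(src_dir, c(
  "s01.tif (green).tif_thresholded_0001-0628.tifresults.csv",
  "s02.tif (green).tif_thresholded_0002-0411.tifresults.csv"
))))

# Copy the first two files so the originals stay as they are.
first_two <- head(list.files(src_dir), 2)
invisible(file.copy(file.path(src_dir, first_two), new_dir))

# Remove the extra fragment from each copied name (literal match, no regex).
fragment  <- ".tif (green).tif_thresholded"
new_names <- sub(fragment, "", first_two, fixed = TRUE)
invisible(file.rename(file.path(new_dir, first_two),
                      file.path(new_dir, new_names)))

print(list.files(new_dir))
```

Changing `fragment` is all that is needed to try a different edit to the names, and the matching change to the FracLac sample names would still have to be made separately.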

AlexJTYang commented 6 months ago

hmm after trying that I'm running into another issue... whenever I modify those file names, the tidying functions for both fraclac and skeleton show a new warning (there is no longer an ID number being taken into the columns), and when I push through and run the merge regardless, there is still no data being brought into the single dataframe. As soon as I use the previous file names that warning disappears, so I'm assuming this has something to do with the fraclac and skeleton analysis and how it generates the file names...

interestingly I tried doing a full_join and was able to generate a dataframe that contained all the data (of course names and IDs were not merged, so it's essentially just a giant table with doubled names/IDs), so my hunch is that something is wrong with the inner_join that might have something to do with the naming?
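That difference between the two joins is what one would expect when the key values never match: an inner join keeps only rows whose keys appear in both tables, while a full join keeps every row from both and fills the gaps with NA. A small base-R sketch of the same behaviour, using `merge()` (where `all = FALSE` corresponds to an inner join and `all = TRUE` to a full join) on hypothetical toy data with a hypothetical key column `Name`:

```r
# Hypothetical toy tables whose keys differ slightly and therefore never match.
fraclac  <- data.frame(Name = c("img1_cell1", "img1_cell2"),
                       Density = c(0.41, 0.38))
skeleton <- data.frame(Name = c("img1.cell1", "img1.cell2"),
                       Branches = c(12, 7))

# Inner join: only keys found in both tables survive.
inner <- merge(fraclac, skeleton, by = "Name")

# Full join: every row from both tables is kept, unmatched cells become NA.
full <- merge(fraclac, skeleton, by = "Name", all = TRUE)

print(nrow(inner))
print(nrow(full))
print(full)
```

A full join whose row count equals the sum of the two inputs, with NA in every row, points to mismatched keys rather than to a fault in the join itself.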

thanks so much for trying to help me quincy

# clean up fraclac and skeleton datasets separately
fraclac <- fraclac_tidying(fraclac.dir)
Warning message:
Expected 2 pieces. Missing pieces filled with NA in 2 rows [1, 2].
skeleton <- skeleton_tidying(skeleton.dir)
Warning message:
Expected 2 pieces. Missing pieces filled with NA in 2 rows [1, 2].
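The wording of this warning matches what tidyr::separate() prints when a string splits into fewer pieces than requested. That suggests (an inference, not something confirmed in this thread) that the tidying functions split each name on a separator the renamed files no longer contain. A base-R sketch of the situation, with a hypothetical separator and hypothetical names:

```r
# Hypothetical names and separator; the package's real splitting rule may differ.
names_in <- c("s01.tif_thresholded_0001-0628",  # still contains the separator
              "s01_0001-0628")                  # separator was edited away
sep <- ".tif_thresholded_"

# Split each name literally on the separator and count the resulting pieces.
pieces   <- strsplit(names_in, sep, fixed = TRUE)
n_pieces <- lengths(pieces)
print(n_pieces)

# Names that yield fewer than two pieces would end up with a missing ID.
too_short <- names_in[n_pieces < 2]
print(too_short)
```

Names listed at the end are the ones for which a two-way split has nothing to put in the second column.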

qpc444 commented 6 months ago

Can you post a screenshot of your FracLac Hull and Circle file contents?

Quincy

AlexJTYang commented 6 months ago

here are the original files image

and here are the new "trouble shooting" files image

Alex

qpc444 commented 6 months ago

Unfortunately I do not know where to go from here. I think Jenn will have to help you solve this one. You might be correct about the inner_join function, but I am not familiar with it to know what is happening.

Sorry I couldn't be more helpful.

Quincy

AlexJTYang commented 6 months ago

that is okay! thank you so much for trying to help either way!

Alex

AlexJTYang commented 6 months ago

just an update. I was messing around with the data and realized there is a naming mismatch between the skeleton results files and the fraclac outputs.

I had a space between "tif" and "(green)" in the skeleton results file names but NOT in the fraclac output, so the two were never recognized as the same name and therefore never matched when trying to merge

so it was indeed a naming issue like you suggested. Problem solved!
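For anyone hitting the same thing, one possible workaround is to strip whitespace from the name column of both tables before merging, rather than renaming every file. A minimal base-R sketch with hypothetical toy data; the key column `Name` is an assumption, and base `merge()` stands in for the package's own merge step:

```r
# Hypothetical toy tables: the skeleton names carry a space the fraclac names lack.
fraclac  <- data.frame(Name = c("s01.tif(green)_0001", "s02.tif(green)_0002"),
                       Density = c(0.41, 0.38))
skeleton <- data.frame(Name = c("s01.tif (green)_0001", "s02.tif (green)_0002"),
                       Branches = c(12, 7))

# Remove every run of whitespace so both sides use the same spelling.
strip_spaces <- function(x) gsub("\\s+", "", x)
fraclac$Name  <- strip_spaces(fraclac$Name)
skeleton$Name <- strip_spaces(skeleton$Name)

# With matching keys, the inner join now keeps one row per cell.
merged <- merge(fraclac, skeleton, by = "Name")
print(merged)
```

Normalizing both sides in code keeps the original files untouched, though fixing the names at the source avoids the mismatch recurring in later runs.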

thanks again Quincy,

Alex