ahmohamed / lipidr

Data Mining and Analysis of Lipidomics datasets in R
https://www.lipidr.org/
Other
27 stars 13 forks

Error in add_sample_annotation #23

Closed vastevenson closed 2 years ago

vastevenson commented 2 years ago

Hi,

I am trying to have lipidr accept my CSV with the sample annotations, but I keep getting this error:

> export_annot_path = paste(dir_path,dir_name,"/species_norm_annotations.csv",sep="")
> d <- add_sample_annotation(d, export_annot_path)
Error in .check_sample_annotation(data, annot) : 
  All sample names must be in the first column or a column named "Sample"

I have made sure that my first column is named "Sample". Do the sample names need to follow a specific format? Is there a way to have lipidr tell me which sample names it currently has (or is expecting) from the experimental dataset that it loaded just before this?

Annotations csv that is throwing errors: species_norm_annotations.csv

Thanks for your help!

ahmohamed commented 2 years ago

The sample names in your Sample column have to exactly match their names in the LipidomicsExperiment object d (the match is case-sensitive). You can see the sample names using colnames(d).
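
To see exactly which names fail to match, the two sets can be compared with setdiff(). This is a minimal sketch: experiment_samples and annot are hypothetical stand-ins for colnames(d) and the annotation table, not values from this issue.

```r
# Hypothetical stand-in for colnames(d)
experiment_samples <- c("S1", "S2", "S3")

# Hypothetical annotation table; note the lower-case "s2"
annot <- data.frame(
  Sample = c("S1", "s2", "S3"),
  Group = c("A", "A", "B"),
  stringsAsFactors = FALSE
)

# Samples in the experiment that have no matching annotation row
missing_in_annot <- setdiff(experiment_samples, annot$Sample)

# Annotation rows whose name does not occur in the experiment
extra_in_annot <- setdiff(annot$Sample, experiment_samples)

print(missing_in_annot)
print(extra_in_annot)
```

If both results are empty, the names match exactly, including case.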

Looking at your file, I suspect you have numeric sample names, which are usually problematic since R may change them (more here).

If that's the case, you can easily fix your sample names by prefixing them with a letter. For example, if colnames(d) == 1:21, you can change that with colnames(d) = paste0("X", colnames(d)).
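
A small base-R sketch of the renaming, using a plain matrix with hypothetical numeric column names as a stand-in for d. make.names() is included because it is the rule read.csv() applies to column headers by default (check.names = TRUE), which is one way numeric names can end up changed.

```r
# Hypothetical stand-in for d: a matrix whose column names are "1", "2", "3"
m <- matrix(
  0,
  nrow = 2, ncol = 3,
  dimnames = list(c("lipid_a", "lipid_b"), as.character(1:3))
)

# Prefix every sample name with a letter so it is a syntactically valid name
colnames(m) <- paste0("X", colnames(m))
renamed <- colnames(m)

# What R's name-sanitizing rule does to purely numeric names
mangled <- make.names(c("1", "2", "3"))

print(renamed)
print(mangled)
```

The Sample column of the annotation table would need the same prefix so that the two sets of names still match.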

vastevenson commented 2 years ago

Thank you for providing this solution. Do you know if the add_sample_annotation() method can accept data frames directly (as as_lipidomics_experiment() does)? Or does the input have to be a .csv file?

ahmohamed commented 2 years ago

Yes, it can. See here.

vastevenson commented 2 years ago

I tried passing a data frame to the add_sample_annotation() method, but it still throws the same error. I made sure that the first column of the annotation df exactly matches the column names of d. Here's the code I wrote:

samples_list = colnames(d)
# Drop the first element, which is just the string "lipids" and not a real sample
cleaned_samples_list = samples_list[-1]
Sample = cleaned_samples_list
# Add the Sample column to the df as the first column
annotations_df = data.frame(Sample, annotations_df)
# Save the df to a CSV. write.csv() always uses "," and ignores sep (with a warning);
# row.names = FALSE keeps Sample as the first column in the file.
sample_annot_path = paste0(dir_path, dir_name, "/sample_annot_v2.csv")
write.csv(annotations_df, file = sample_annot_path, row.names = FALSE)

d <- add_sample_annotation(d, sample_annot_path)

Here's what the .csv file of the annotations df looks like: sample_annot_v2.csv

I have also tried: d <- add_sample_annotation(d, annotations_df), but I still get the message:

Error in .check_sample_annotation(data, annot) : 
  All sample names must be in the first column or a column named "Sample"

Here's a screenshot of the annotations df: image

Another thing to note is that when I try to plot_samples(d), I notice that 'lipids' is being treated as a Sample: image

Update: I managed to get add_sample_annotation() to accept my data frame by adding an extra row for the empty 'lipids' sample: image

Is it possible to delete the column called 'lipids' from d?

Do you have any recommendations on how to proceed?

Thank you for your help!
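
On the question of dropping the 'lipids' column: below is a minimal sketch of the usual column subsetting, shown on a plain matrix with hypothetical values as a stand-in for d. LipidomicsExperiment objects are built on SummarizedExperiment, which supports the same d[, keep] form, but applying it to d is an assumption here and was not confirmed in this thread. A 'lipids' entry among the samples may also indicate that the lipid-name column was read in as a sample, in which case fixing the import would be the cleaner route.

```r
# Hypothetical stand-in for d: the first "sample" is really the lipid-name column
m <- matrix(
  1:6,
  nrow = 2,
  dimnames = list(c("lipid_a", "lipid_b"), c("lipids", "S1", "S2"))
)

# Logical index of the columns to keep: everything except "lipids"
keep <- colnames(m) != "lipids"

# drop = FALSE keeps the result two-dimensional even if one column remains
m_clean <- m[, keep, drop = FALSE]

print(colnames(m_clean))
```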