Manual assignment of colData@listData[["group"]]

rsuseno2907 commented 10 months ago

Creating a PhIPData object and running BEER with the following lines gave me the error "logical subscript contains NAs":

tmp_phip <- PhIPData(counts = tmp_merged, metadata = tmp_metadata) 
beer_out <- brew(tmp_phip, assay.names = assay_locations)

I looked into the tmp_phip object and found that the tmp_phip@colData@listData[["group"]] field seems to be filled with NAs. Once I manually assign this field to a list that's appropriate to my data, running BEER gave me a different issue, but I think that's a discussion for another day.

Is it an expected step to manually assign the groups? I checked that neither my counts data nor my sample metadata contains NAs, and the manual assignment confirmed that the error above came from that "group" field.

Note: In the metadata field of the PhIP object, the group seems to be properly assigned.

athchen commented 9 months ago

Just to make sure I understand, when you create the tmp_phip object, there are no empty values and there are no errors when creating the object. But when you run brew(), you encounter an error related to the group column of the metadata?

Can you give me the output of the following commands?

tmp_phip
sampleInfo(tmp_phip)
getBeadsName()

rsuseno2907 commented 8 months ago

Correct, there are no empty values in the data (the matrix is mostly filled with zeros). When creating the object, a warning popped up: ! Missing peptide start and end position information. Replacing missing values with 0.

The output of tmp_phip:

> tmp_phip
class: PhIPData 
dim: 10 3 
metadata(2): sample_name group
assays(3): counts logfc prob
rownames(10): gi|10047090|ref|NP_055147.1|_small_muscular_protein_[Homo_sapiens]_fragment_0
  gi|10047090|ref|NP_055147.1|_small_muscular_protein_[Homo_sapiens]_fragment_1 ...
  gi|10047102|ref|NP_057388.1|_probable_ribosome_biogenesis_protein_RLP24_[Homo_sapiens]_fragment_4
  gi|10047102|ref|NP_057388.1|_probable_ribosome_biogenesis_protein_RLP24_[Homo_sapiens]_fragment_5
rowData names(0):
colnames(3): PEP43_Plate3_GFAP3_S265_L001_R1_001 PEP43_Plate3_GFAP2_S230_L001_R1_001 PEP43_Plate3_BEAD3_S278_L001_R1_001
colData names(1): group
beads-only name(0): beads

The output of sampleInfo(tmp_phip):

DataFrame with 3 rows and 1 column
                                        group
                                    <logical>
PEP43_Plate3_GFAP3_S265_L001_R1_001        NA
PEP43_Plate3_GFAP2_S230_L001_R1_001        NA
PEP43_Plate3_BEAD3_S278_L001_R1_001        NA

The output of getBeadsName() is "beads"

In this run, I'm heavily subsetting my data to only 3 samples and 10 peptides to ensure that there are indeed no empty values in my counts matrix, tmp_merged.

Thank you!

athchen commented 8 months ago

To run BEER you have to provide sample information -- it infers the groups of the samples based on that column in order to run a sample vs. all beads-only comparison. Based on your sampleInfo(tmp_phip) output, this column is missing the necessary information.
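
A minimal base R sketch of why an all-NA group column cannot mark any sample as beads-only. The data frame sample_info is a hypothetical stand-in for sampleInfo(tmp_phip), not the PhIPData object itself, and the comparison shown is an assumption about the kind of check involved, not BEER's actual code:

```r
# Hypothetical stand-in for sampleInfo(tmp_phip), using the sample names
# from this thread. This is a plain data.frame, not a PhIPData object.
sample_info <- data.frame(
  group = c(NA, NA, NA),
  row.names = c(
    "PEP43_Plate3_GFAP3_S265_L001_R1_001",
    "PEP43_Plate3_GFAP2_S230_L001_R1_001",
    "PEP43_Plate3_BEAD3_S278_L001_R1_001"
  )
)

# An all-NA column is stored as logical, so comparing it with "beads"
# yields NA for every sample instead of TRUE or FALSE.
is_beads_before <- sample_info$group == "beads"

# Fill in the groups in the same order as the rows.
sample_info$group <- c("gfap", "gfap", "beads")

# The comparison is now a usable logical vector.
is_beads_after <- sample_info$group == "beads"
```

On the real object, assigning the column with something like tmp_phip$group <- c("gfap", "gfap", "beads") should have the same effect, assuming the values are listed in the same order as the columns of the object.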

rsuseno2907 commented 8 months ago

Thanks for the clarification! I've filled them out and my sampleInfo(tmp_phip) now looks like the following:

DataFrame with 3 rows and 1 column
                                          group
                                    <character>
PEP43_Plate3_GFAP3_S265_L001_R1_001        gfap
PEP43_Plate3_GFAP2_S230_L001_R1_001        gfap
PEP43_Plate3_BEAD3_S278_L001_R1_001       beads

I tried running brew() again but ran into the following error message:

Error in if (a_est < lower) { : missing value where TRUE/FALSE needed
In addition: Warning message:
In estimateDisp.default(y = y$counts, design = design, group = group,  :
  There is no replication, setting dispersion to NA.
Error during wrapup: 'length = 38' in coercion to 'logical(1)'
Error: no more error handlers available (recursive errors?); invoking 'abort' restart

Do you possibly know what might be causing the error? Am I still missing any information needed to create the appropriate PhIPData object?

athchen commented 8 months ago

Since you only have one beads-only sample, BEER cannot estimate the variance in the proportion of reads pulled in the beads-only samples by each peptide. You need to have more than one beads-only sample.
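
A quick pre-flight check along these lines can catch the problem before calling brew(). This is only a sketch in base R: the function names count_beads_only() and require_beads_replicates() are hypothetical helpers, not part of beer or PhIPData, and "beads" is assumed to be the value returned by getBeadsName():

```r
# Hypothetical helper: count how many samples are labelled as beads-only.
# NA group labels are ignored rather than counted.
count_beads_only <- function(groups, beads_name = "beads") {
  sum(groups == beads_name, na.rm = TRUE)
}

# Hypothetical helper: stop early unless there are at least two
# beads-only samples, since a single one gives no replication.
require_beads_replicates <- function(groups, beads_name = "beads") {
  n_beads <- count_beads_only(groups, beads_name)
  if (n_beads < 2) {
    stop(sprintf(
      "Found %d beads-only sample(s) labelled '%s'; more than one is needed.",
      n_beads, beads_name
    ))
  }
  invisible(n_beads)
}

# Groups from the three-sample subset in this thread: only one beads-only sample.
n_beads_in_subset <- count_beads_only(c("gfap", "gfap", "beads"))
```

With the real object, the groups vector could be taken from sampleInfo(tmp_phip)$group and passed to require_beads_replicates() before running brew().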

rsuseno2907 commented 8 months ago

Understood. Thank you for addressing my questions! Closing the issue now.

athchen / beer

Manual assignment of colData@listData[["group"]] #7