Closed: yeroslaviz closed this issue 1 year ago
A second warning appears when running the differential expression step:
it warns about not being able to compute exact p-values with ties. What causes this warning?
P.S. Is there a way to reduce the huge number of warnings? Seeing each one once would be enough.
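A note on the ties warning: in base R, `wilcox.test` emits "cannot compute exact p-value with ties" when the pooled values contain ties, because the exact null distribution assumes distinct ranks. Whether MARVEL's differential expression step calls `wilcox.test` internally is an assumption here. The sketch below only reproduces the base R behaviour on made-up vectors `g1` and `g2`.

```r
# Toy expression values for two cell groups (made up); 2, 3 and 4 occur more than once.
g1 <- c(1, 2, 2, 3, 5)
g2 <- c(2, 3, 4, 4, 6)

# Default call: small samples request the exact test, and the ties make it warn.
tie_warning <- tryCatch(
  wilcox.test(g1, g2),
  warning = function(w) conditionMessage(w)
)
print(tie_warning)

# Asking for the normal approximation explicitly avoids the warning.
approx_res <- wilcox.test(g1, g2, exact = FALSE)
print(approx_res$p.value)
```

When the warning fires, base R has already fallen back to the normal approximation, so the p-value is still computed. Wrapping a call in `suppressWarnings()` hides the messages if their volume is the only concern.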
Could you please share the R object and the code that generated these warnings via Google Drive?
This is strange: I have rerun the same command again, without changing anything, and now I don't get these warnings.
If you're still interested, I think I can provide you with the R object you asked for, but the problem seems to have vanished.
Thanks anyway.
I have a different question now. When running the Differential expression analysis for both genes and splicing events I get a huge table with a lot of duplicated entries.
> head(marvel$DE$Exp.Spliced$Table)
gene_id gene_name gene_type n.cells.g1 n.cells.g2 mean.g1
1 ENSMUSG00000095595 Fam177a protein_coding 23 24 5.430945
2 ENSMUSG00000095595 Fam177a protein_coding 23 24 5.430945
3 ENSMUSG00000095595 Fam177a protein_coding 23 24 5.430945
4 ENSMUSG00000095595 Fam177a protein_coding 23 24 5.430945
5 ENSMUSG00000095595 Fam177a protein_coding 23 24 5.430945
6 ENSMUSG00000095595 Fam177a protein_coding 23 24 5.430945
mean.g2 log2fc statistic p.val p.val.adj
1 4.905805 -0.5251404 NA 8.866898e-08 0.0004580672
2 4.905805 -0.5251404 NA 8.866898e-08 0.0004580672
3 4.905805 -0.5251404 NA 8.866898e-08 0.0004580672
4 4.905805 -0.5251404 NA 8.866898e-08 0.0004580672
5 4.905805 -0.5251404 NA 8.866898e-08 0.0004580672
6 4.905805 -0.5251404 NA 8.866898e-08 0.0004580672
Is this deliberate? I can filter out all these duplicates, but they make the R object really big and the session very slow.
When I extract the complete table with all the entries, the file size is 49 GB; after removing all duplicated rows, I am left with 2.9 MB.
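For reference, a minimal base R sketch of the filtering described above. The table `de_table` is a toy stand-in, not the real `marvel$DE$Exp.Spliced$Table`: the first gene reuses values from the output above and the second row is made up.

```r
# Toy stand-in for a DE result table with repeated identical rows.
de_table <- data.frame(
  gene_id   = c(rep("ENSMUSG00000095595", 3), "hypothetical_gene"),
  gene_name = c(rep("Fam177a", 3), "GeneB"),
  log2fc    = c(rep(-0.5251404, 3), 1.25),
  stringsAsFactors = FALSE
)

# duplicated() flags every row that repeats an earlier row exactly.
n_dup <- sum(duplicated(de_table))

# Keep only the first occurrence of each row and reset the row names.
de_unique <- de_table[!duplicated(de_table), ]
rownames(de_unique) <- NULL

print(n_dup)
print(de_unique)
```

This treats the symptom only. If the repeats originate in an input table, removing them there keeps the object small from the start.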
I'm attaching the script I'm using here, but it is mostly a copy of your plate-pipeline web page.
I have here the links to two different marvel objects.
The first one was created as described in the pipeline (126 MB in size).
The second one was made after the differential analysis at the gene level (700 MB in size).
Let me know if you need anything else.
Thanks
Could you please share your R object as saved at line 783 of sharedScript.Rmd: save(marvel, file=paste(path, file, sep=""))?
I have shared it; it is in the first link I added above.
Here it is again: https://datashare.biochem.mpg.de/s/I7QxDqxqFRnC3H4
The next error appears when running the IsoSwitch command to assign dynamics to gene splicing events.
Here, I get this error:
Error in `$<-.data.frame`(`*tmp*`, "cor", value = NA) :
replacement has 1 row, data has 0
Can you please tell me how to interpret this one? Is something missing in the data, or do I not have any significant events?
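On reading the message itself: base R raises exactly this error when a length-one value is assigned as a new column of a data frame with zero rows. The sketch below reproduces it on an empty toy data frame. That an intermediate table inside IsoSwitch ends up empty (for example because no event passed a cutoff) is an assumption, not something confirmed in this thread.

```r
# An empty data frame, standing in for a table with no rows left after filtering.
events <- data.frame(event_id = character(0), stringsAsFactors = FALSE)

# Assigning a single value to a new column fails because there are no rows to fill.
err_msg <- tryCatch(
  {
    events$cor <- NA
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(err_msg)

# A guard of this kind makes the empty case explicit instead of failing later.
if (nrow(events) == 0) {
  message("No rows to annotate; check the upstream filtering thresholds.")
}
```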
This is strange: I have rerun the same command without changing anything, and now I don't get these warnings.
If you're still interested, I think I can provide you with the R object you asked for, but the problem seems to have vanished.
Thanks anyway.
I have re-run the command and I got all the NaNs warnings again.
Any ideas why this is happening?
Thanks again
Did you have time to look at the strange behaviour of having many duplicated entries in the marvel object?
This already happens when running the CompareValues command:
> marvel$DE$Exp$Table[1:5, ]
gene_id gene_name gene_type n.cells.g1 n.cells.g2 mean.g1 mean.g2
1 ENSMUSG00000110896 Gm48273 TEC 23 24 5.55221 7.396245
2 ENSMUSG00000110896 Gm48273 TEC 23 24 5.55221 7.396245
3 ENSMUSG00000110896 Gm48273 TEC 23 24 5.55221 7.396245
4 ENSMUSG00000110896 Gm48273 TEC 23 24 5.55221 7.396245
5 ENSMUSG00000110896 Gm48273 TEC 23 24 5.55221 7.396245
log2fc statistic p.val p.val.adj
1 1.844035 NA 8.620796e-12 4.13087e-06
2 1.844035 NA 8.620796e-12 4.13087e-06
3 1.844035 NA 8.620796e-12 4.13087e-06
4 1.844035 NA 8.620796e-12 4.13087e-06
5 1.844035 NA 8.620796e-12 4.13087e-06
Many apologies for the late reply; I was trying to get several conference abstracts off my desk!
Could you please re-share your R object and scripts? I will look into this now.
No worries about it, I guess you're as busy as I am. I know how it feels.
I have solved the duplication problem. For some reason my GeneFeature matrix contained duplicated entries, which were carried further downstream in the analysis.
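A minimal sketch of how such duplicates can be spotted in a feature table before it goes into the object. The data frame `features` and its contents are hypothetical; only the `gene_id` column name comes from the outputs in this thread.

```r
# Hypothetical gene metadata with some gene IDs listed more than once.
features <- data.frame(
  gene_id   = c("g1", "g2", "g2", "g3", "g3", "g3"),
  gene_name = c("A",  "B",  "B",  "C",  "C",  "C"),
  stringsAsFactors = FALSE
)

# anyDuplicated() returns 0 when all IDs are unique,
# otherwise the index of the first repeated ID.
first_dup <- anyDuplicated(features$gene_id)

# Count how often each ID occurs and list the ones that repeat.
id_counts <- table(features$gene_id)
repeated_ids <- names(id_counts)[id_counts > 1]

# Keep one row per gene ID.
features_unique <- features[!duplicated(features$gene_id), ]

print(first_dup)
print(repeated_ids)
print(nrow(features_unique))
```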
But now I'm stuck on an error I can't figure out when calculating the gene splicing dynamics.
I can run the IsoSwitch command and get the results:
cor freq pct
1 Coordinated 60 3.4149118
2 Opposing 49 2.7888446
3 Iso-Switch 1632 92.8856005
4 Complex 16 0.9106431
but when I try to plot them I get the following error:
> marvel <- PlotValues(MarvelObject=marvel,
+ cell.group.list=cell.group.list,
+ feature=gene_id,
+ maintitle="gene_name",
+ xlabels.size=7,
+ level="gene"
+ )
Error in xj[i] : invalid subscript type 'list'
The link to the R object after running the IsoSwitch() command is below (size ~12 GB):
https://datashare.biochem.mpg.de/s/mXN6GprNpQHChUw
Thanks for the help
Great that you managed to solve the gene duplication issue!
Could you please paste the section of your script that defines your cell.group.list and gene_id objects? I suspect the issue lies in how the cell groups to plot are defined in cell.group.list.
I think I have solved this problem as well.
For some reason the marvel$GeneFeatures subset was saved as a tibble and not as a data.frame. For that reason the gene_id object was empty, which caused this error message.
Thanks for the patience and the help. I'll close this ticket and reopen it (or open a new one) if needed.
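To illustrate the tibble versus data.frame difference in base R only: a tibble's `[` never drops a single column to a vector, so code that expects a character vector can receive a list-like object instead. The sketch below mimics that with `drop = FALSE` on a plain data frame, which is a stand-in for the tibble behaviour and not the actual MARVEL code path. The tables `features` and `expr` are hypothetical.

```r
# Hypothetical feature metadata and expression matrix stored as data frames.
features <- data.frame(
  gene_id   = c("g1", "g2", "g3"),
  gene_name = c("A", "B", "C"),
  stringsAsFactors = FALSE
)
expr <- data.frame(s1 = c(1, 2, 3), s2 = c(4, 5, 6), row.names = features$gene_id)

# data.frame default: selecting one column yields a plain character vector.
id_vec <- features[features$gene_name == "B", "gene_id"]

# Tibble-like result: the selection stays a one-column table (a list).
id_tbl <- features[features$gene_name == "B", "gene_id", drop = FALSE]

# The vector works as a row index; the list-like object does not.
ok_row  <- expr[id_vec, ]
err_msg <- tryCatch(expr[id_tbl, ], error = function(e) conditionMessage(e))

print(ok_row)
print(err_msg)
```

Converting the table with `as.data.frame()` before use, or extracting the column with `[[`, gives a plain vector again.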
Assa
Hi,
thanks for this great tool. It took a while to create the marvel object, but I managed to do so. Now, when trying to assign modalities, I get the warning "Warning: NaNs produced" many, many times.
I know it is just a warning, but I want to make sure all is good in my data and in the way I created the marvel object. I also want to understand why this warning appears and where the NaNs are coming from.
Thanks in advance for the info
Assa
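A general base R sketch for seeing each distinct warning once while a noisy step runs. The function `noisy_step` is hypothetical and stands in for whatever call produces the warnings. It uses `sqrt()` of negative numbers only because that is a simple way to trigger "NaNs produced"; where MARVEL's NaNs actually come from is not established in this thread.

```r
# Hypothetical stand-in for a step that warns many times.
noisy_step <- function(x) {
  vapply(x, function(v) sqrt(v), numeric(1))
}

# Environment used to collect the distinct warning messages.
warn_log <- new.env()
warn_log$seen <- character(0)

result <- withCallingHandlers(
  noisy_step(c(4, -1, -9, 16)),
  warning = function(w) {
    # Record each distinct message once, then silence the warning.
    warn_log$seen <- union(warn_log$seen, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

print(warn_log$seen)
print(result)
```

Printing `conditionCall(w)` inside the handler would additionally show which call raised each warning, which helps when tracing where NaN values originate.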