LauraMCE / lncRNA_BC

This repository contains information about my master's project. The main topic is lincRNAs as biomarkers in breast cancer. The main objective is to identify lincRNA biomarkers by transcriptome analysis.

Code color in graphs #8

Closed LauraMCE closed 5 years ago

LauraMCE commented 5 years ago

Help with my advisor's issue

Hi! My advisor asked me to do a very difficult task with my paper's figures. Here is literally what he said:

There are some problems with the color codes in the volcano plot and the PCA plot. First of all, the volcano plot does not distinguish between under- and overexpressed genes by color: both have the same color. Second, there is no difference in color between significant and non-significant differentially expressed genes (all are black dots). Please check that. Also, in the other graphs we followed a color code for resistant and sensitive patients (cyan blue and red, respectively). In your graph they are inverted. Please correct. Thanks!

So, I hope that you can help me with the 2 issues.

1.- Volcano Plot

This is the code I'm using for Volcano plot.

##Indicate color code##

cols[BCresultsNR$log2FoldChange < -1.5] <- "#0066FF"
cols[BCresultsNR$log2FoldChange > 1.5] <- "#0033CC"
cols[BCresultsNR$pvalue == 0] <- "black"
cols[BCresultsNR$sig < -log10(alpha) ] <- "#000033"
cols[BCresultsNR$pvalue > 0.05] <- "#CCCCCC"

My problem is that I can't get a color code with 6 colors. Please help!

2.- PCA plot (SOLVED!!)

I had problems with the PCA plot too, but I solved them. Here is my first script, which does not do what I want, and here is my solution. Thanks!!

camillethuyentruong commented 5 years ago

Hi Laura, I can't access the data for the plot. 1) You need to save the data as a .csv file, 2) put the file in the data folder, and 3) import these data in your R code.

cristoichkov commented 5 years ago

Hi Laura

To import .csv files you have to do this:

name_df <- read.csv("name_your_file.csv")

camillethuyentruong commented 5 years ago

Hi Cris,

The command to export a table is: write.csv(table_in_R$columns, "name.csv")
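The export and import commands from the two comments above can be combined into one round trip. The following is only a minimal sketch in base R: the `results` table, its gene IDs, and the temporary file path are made up for illustration and are not the project's data. It assumes the row names should travel as the first column of the .csv file.

```r
# Hypothetical stand-in for a results table whose row names are gene IDs
results <- data.frame(
  log2FoldChange = c(2.1, -1.8, 0.3),
  pvalue         = c(0.001, 0.04, 0.7),
  row.names      = c("gene_A", "gene_B", "gene_C")
)

# Temporary file used only for this demonstration
csv_path <- file.path(tempdir(), "name.csv")

# Export: row.names = TRUE (the default) writes the row names as the first column
write.csv(results, csv_path, row.names = TRUE)

# Import: header = TRUE reads the column names,
# row.names = 1 uses the first column as the row names
restored <- read.csv(csv_path, header = TRUE, row.names = 1)

rownames(restored)
```

Passing `row.names = 1` at import time is one way to get the row names back without a separate renaming step.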

LauraMCE commented 5 years ago

Hi! I've already uploaded the .csv file and I also corrected the script. Thank you!!

VeroIarrachtai commented 5 years ago

Maybe you can check this, in case you have a header and other things to consider.

ERROR (a 9 typed instead of the closing parenthesis): read.table(file, header = FALSE, sep = "", quote = "\"'"9

TRUE (correct): read.table(file, header = FALSE, sep = "", quote = "\"'")

LauraMCE commented 5 years ago

Hi! I've indicated the header but I don't know how to indicate rownames... help!

camillethuyentruong commented 5 years ago

https://github.com/opetchey/RREEBES/wiki/Reading-data-and-code-from-an-online-github-repository

How to upload a file in R from Github

LauraMCE commented 5 years ago

Hi! I added a line to the script:

VPSol <- data.frame(BCresultsNR$log2FoldChange, BCresultsNR$sig, row.names = rownames(BCresultsNR)) ##Creates a data frame with coordinates##

This generates a data frame with the coordinate data. For colors, I would like to categorize the data as follows:

Color 1: $log2FC > 1.5, $sig < 0.05
Color 2: $log2FC < -1.5, $sig < 0.05
Color 3: $log2FC > 1.5, $sig > 0.05
Color 4: $log2FC < -1.5, $sig > 0.05
Color 5: 0 < $log2FC < 1.5, $sig > 0.05
Color 6: -1.5 < $log2FC < 0

Thanks!!!

cristoichkov commented 5 years ago

Hi Laura

I have the answer, you have to do this:

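##Load the packages used below (dplyr for rename and %>%, ggplot2 for the plot)##

library(dplyr)
library(ggplot2)

##Rename the columns to simplify the dataframe##

VPSol <- VPSol %>% rename(FoldChange = BCresultsNR.log2FoldChange, p_value = BCresultsNR.sig)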

##Create a column with colors depending on the value of Fold Change and p-value##  

VPSol$color <- ifelse((VPSol$FoldChange > 1.5) & (VPSol$p_value < 0.05), "Col_1",
                      ifelse((VPSol$FoldChange < -1.5) & (VPSol$p_value < 0.05), "Col_2",
                             ifelse((VPSol$FoldChange > 1.5) & (VPSol$p_value > 0.05), "Col_3",
                                    ifelse((VPSol$FoldChange < -1.5) & (VPSol$p_value > 0.05), "Col_4",
                                           ifelse((VPSol$FoldChange < 1.5) & (VPSol$p_value > 0.05), "Col_5", "Col_6")))))

##Create plot##
ggplot(VPSol, aes(x=FoldChange, y=p_value)) +
  geom_point(aes(colour = color)) 

It runs on my computer, so I hope you can run it too. In ggplot we can also add lines to divide the grid, but I don't know how you want them.
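For reference, the same categorization can be sketched in base R with logical masks instead of nested ifelse calls, together with a fixed color per category and dashed threshold lines. This is only an illustration: the `toy` data frame and the hex codes in `pal` are placeholders, not the project's data or the paper's colors. It follows the logic of the nested ifelse above rather than the six-color list, and it treats a p-value of exactly 0.05 as non-significant, so boundary cases can differ slightly.

```r
# Hypothetical toy data standing in for VPSol; the values are made up
toy <- data.frame(
  FoldChange = c(2.3, -2.1, 1.8, -1.9, 0.7, -0.4, 0.5),
  p_value    = c(0.01, 0.02, 0.20, 0.30, 0.50, 0.60, 0.01)
)

fc_cut <- 1.5   # fold-change threshold used in the thread
p_cut  <- 0.05  # significance threshold used in the thread

# Logical masks, one per condition
up   <- toy$FoldChange >  fc_cut
down <- toy$FoldChange < -fc_cut
sig  <- toy$p_value    <  p_cut

# Start with the fallback class, then overwrite using mutually exclusive masks
toy$color <- "Col_6"
toy$color[up   &  sig] <- "Col_1"
toy$color[down &  sig] <- "Col_2"
toy$color[up   & !sig] <- "Col_3"
toy$color[down & !sig] <- "Col_4"
toy$color[!up & !down & !sig] <- "Col_5"

# Assumed palette: placeholder hex codes, one per category
pal <- c(Col_1 = "#D55E00", Col_2 = "#0072B2", Col_3 = "#E69F00",
         Col_4 = "#56B4E9", Col_5 = "#999999", Col_6 = "#000000")

# Scatter plot with one color per category and dashed threshold lines
plot(toy$FoldChange, toy$p_value, col = pal[toy$color], pch = 19,
     xlab = "FoldChange", ylab = "p_value")
abline(v = c(-fc_cut, fc_cut), h = p_cut, lty = 2)
```

With ggplot2, the same named vector could be passed to `scale_colour_manual(values = pal)`, and the dividing lines could be drawn with `geom_vline(xintercept = c(-1.5, 1.5))` and `geom_hline(yintercept = 0.05)`.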

camillethuyentruong commented 5 years ago

Hi Laura

I finally found a way to load the data frame into R from GitHub!

# load the csv file
# the first column needs to have a name - I modified the csv file by adding "sample" as the header of the first column
# the link needs to point to the raw data (click "view raw")

library(readr)
urlfile <- "https://raw.githubusercontent.com/LauraMCE/lncRNA_BC/master/Transcriptome/Data_1_Diff_genes.csv"
BCData <- read_csv(url(urlfile))

# transform the table into a dataframe, indicating that the first column is the row names
BCData <- as.data.frame(BCData)
rownames(BCData) <- BCData[, 1]
BCData <- BCData[, -1]

#verify that the data are correct
head(BCData)
row.names(BCData)
colnames(BCData)

It works on my computer. Try on your own and if it works, modify the code in Github so that everyone can access the data and make the graphic.

Good job everyone, we are getting close!

camillethuyentruong commented 5 years ago

Hi Laura, Don't forget to reply and if no further issues occur, close the issue!

LauraMCE commented 5 years ago

Hi, @camillethuyentruong! The code you provided worked perfectly, and it is already uploaded in the code. Thanks!

Hi, @cristoichkov! Thanks for the script. I made some changes that you can see here, but... it worked!! I now have a very nice volcano plot with a lot of colors.

Thanks everybody @VeroIarrachtai @camillethuyentruong for helping me!!

[Image: VP]