reneshbedre / bioinfokit

Bioinformatics data analysis and visualization toolkit
MIT License

AssertionError: either significant or non-significant genes are missing; try to change lfc_thr or pv_thr to include both significant and non-significant genes #31

Closed arturomarin closed 3 years ago

arturomarin commented 3 years ago

Hi,

I am trying to make a volcano plot that labels all the DEGs, using the following code:

from bioinfokit import analys, visuz
import pandas as pd

csv_list = ["Def_results_Day2_mock_vs_Day2_rdORF6",
            "Def_results_Day2_mock_vs_Day2_rWT",
            "Def_results_Day2_mock_vs_Day4_rdORF6",
            "Def_results_Day2_mock_vs_Day4_rWT",
            "Def_results_Day2_mock_vs_Day6_rdORF6",
            "Def_results_Day2_mock_vs_Day6_rWT",
            "Def_results_Day2_rWT_vs_Day2_rdORF6",
            "Def_results_Day4_rWT_vs_Day4_rdORF6",
            "Def_results_Day6_rWT_vs_Day6_rdORF6"]

for x in csv_list:

    folder = "../deseq2/"
    path = folder + x + ".csv"

    df = pd.read_csv(path)

    visuz.gene_exp.volcano(df=df, geneid='gene_name', lfc='log2FoldChange', lfc_thr=(-6, 6), pv_thr=(1, 0), genenames='deg', pv='PValue', figname=x)

but it gives me the following error:

Traceback (most recent call last):
  File "volcanoplot_script_v0.3.py", line 40, in <module>
    visuz.gene_exp.volcano(df=df, geneid='gene_name', lfc='log2FoldChange', lfc_thr=(-6, 6), pv_thr=(1, 0), genenames='deg', pv='PValue', figname=x)
  File "/Users/arturo/miniconda3/envs/rnaseq/lib/python3.6/site-packages/bioinfokit/visuz.py", line 405, in volcano
    'either significant or non-significant genes are missing; try to change lfc_thr or pv_thr to include ' \
AssertionError: either significant or non-significant genes are missing; try to change lfc_thr or pv_thr to include both significant and non-significant genes

I ran some tests with other values of lfc_thr and pv_thr, but I get the same error. I have attached an example of the volcano plot on which I want to label the DEGs. Can you help me? Def_results_Day2_mock_vs_Day2_rdORF6

reneshbedre commented 3 years ago

Your log fold change threshold is too high, and some of your samples may not have any genes that pass it. Try lowering the log fold change threshold and run again. Let me know if the issue persists.
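
One way to check this before plotting is to count how many genes would fall into each class at a given threshold. The sketch below uses only pandas. The helper name `count_gene_classes` and the demo values are hypothetical, and the classification rule (p-value below the cutoff and absolute log2 fold change at or above the cutoff) is an assumption for illustration, not necessarily the exact rule bioinfokit applies internally.

```python
import pandas as pd


def count_gene_classes(df, lfc_col, pv_col, lfc_thr=1.0, pv_thr=0.05):
    """Count up-regulated, down-regulated and non-significant genes.

    Assumed rule (illustrative only): a gene is significant when its
    p-value is below pv_thr and |log2FC| is at least lfc_thr.
    Rows with a missing p-value are counted as non-significant.
    """
    passes_pv = df[pv_col] < pv_thr
    up = int((passes_pv & (df[lfc_col] >= lfc_thr)).sum())
    down = int((passes_pv & (df[lfc_col] <= -lfc_thr)).sum())
    return {"up": up, "down": down, "non_significant": len(df) - up - down}


# Made-up values, standing in for one DESeq2 result table.
demo = pd.DataFrame({
    "log2FoldChange": [7.2, -6.5, 0.3, 2.1, -1.8],
    "PValue": [0.001, 0.0005, 0.8, 0.01, 0.03],
})

# With a cutoff of 6, both significant and non-significant genes exist.
print(count_gene_classes(demo, "log2FoldChange", "PValue", lfc_thr=6))

# With a cutoff of 10, no gene is significant, so one class is empty.
print(count_gene_classes(demo, "log2FoldChange", "PValue", lfc_thr=10))
```

Running a check like this on each CSV inside the loop, in place of the demo table, would show which comparisons have an empty class at the chosen thresholds and are therefore likely to trigger the assertion.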