soedinglab / WIsH

Predict prokaryotic host for phage metagenomic sequences
GNU General Public License v3.0

3 - Advanced analysis - empty p value matrix #7

Closed MeggyC closed 3 years ago

MeggyC commented 5 years ago

Hey there,

I ran WIsH on 191 viral contigs against 712 genome bins that I generated. To get the associated p-values, I ran the command:

WIsH -c predict -g "/srv/home/s4549287/27052019_nucleotide_seqs_df50_splitpercontig/" -m modelDir -r "/srv/home/s4549287/WIsH/output17062019_5" -b -p -n KeggGaussianFits.tsv &> /srv/home/s4549287/WIsH/output17062019_5/lpredict_log.log

However, the p-value matrix that was returned contained only NAs. How can I generate p-values?

ClovisG commented 5 years ago

Hi! Most likely, the identifiers of your models do not match the identifiers in the KeggGaussianFits.tsv file, which sets the parameters of the null models. Using this file for the null-model parameters implies that you are using KEGG genomes as models.

Assuming that you are using a custom database of genomes, you should parameterize the null models using the procedure described in https://github.com/soedinglab/WIsH#getting-the-null-paramters-for-new-bacterial-models
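One way to check for the identifier mismatch described above is to compare the model names against the first column of the null-parameter file. The following is a minimal sketch, not part of WIsH. It assumes that model files in the model directory carry a `.mm` suffix, that the null-parameter file is tab-separated with the model identifier in its first column, and that identifiers are written without the suffix. The bin names and numeric values are made up for illustration.

```python
from pathlib import Path


def read_null_ids(lines):
    """Collect the first tab-separated field of each non-empty line."""
    ids = set()
    for line in lines:
        line = line.strip()
        if line:
            ids.add(line.split("\t")[0])
    return ids


def model_ids_from_dir(model_dir, suffix=".mm"):
    """Model identifiers taken from file names (assumed suffix: .mm)."""
    return {p.name[: -len(suffix)] for p in Path(model_dir).glob("*" + suffix)}


def compare_ids(model_ids, null_ids):
    """Return (models lacking null parameters, models that have them)."""
    model_ids = set(model_ids)
    return sorted(model_ids - null_ids), sorted(model_ids & null_ids)


# Illustrative data only: two custom bins and one identifier that
# happens to be present in the null-parameter lines.
null_ids = read_null_ids(["eco\t-1.39\t0.05", "bsu\t-1.41\t0.04"])
missing, matched = compare_ids({"bin_001", "bin_002", "eco"}, null_ids)
print("models without null parameters:", missing)
print("models with null parameters:", matched)
```

With real data, `model_ids_from_dir("modelDir")` and the lines of the `-n` file would replace the hard-coded values. If every model ends up in the first list, none of them can be matched to null parameters, which is consistent with the all-NA matrix reported above.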

Does that solve your problem?

Best, Clovis.

rotoscan commented 3 years ago

Hello there,

Thank you, Clovis, for developing this interesting tool.

I would like to obtain p-values for the host predictions of a set of viruses against a set of prokaryotic genomes. According to your protocol, I need to build a null model with viruses known NOT to infect my prokaryotic genomes.

How can one know that? It feels like a paradox to me: I want to know which viruses infect my genomes, yet I first need to know which ones don't?

Could you please elaborate on how exactly I can obtain this information? Or should I simply build the null model with the viral pool in NCBI's RefSeq?

Any help is appreciated!

Thanks! Best, Rodolfo

ClovisG commented 3 years ago

Dear Rodolfo,

Indeed, you can refer to the second part of this section of the documentation (https://github.com/soedinglab/WIsH#getting-the-null-paramters-for-new-bacterial-models), where it says: "However, a simpler but less accurate method can be used to get all the parameters at once or also when lack of the taxonomy of your bacterial models. In that case, you have to collect a set of phage contigs/genomes such that for every bacterial model B_i the size of the set of phages P_i that infects B_i can be neglected compared to the set of phages that does not infect B_i. Then you can get the parameters for each model at once by running the prediction:

./WIsH -c predict -g superDiverseSetOfPhageContigs -m newModelsDir -r outputNullModelResultDir -b

where newModelsDir contains all your bacterial model that are missing their null-model parameters. Then, you can use the script computeNullParameters.R that takes the predictions and create a file containing the null parameters for every bacterial model in newModelsDir.

Afterwards, to get the p-values while predicting interactions, please specify the options "-b -n nullParameters.tsv" in a prediction call."
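The idea behind the quoted procedure can be illustrated with a small sketch: for each bacterial model, summarise the scores obtained on the diverse phage set with a Gaussian, then read a p-value for a new score as an upper-tail probability under that Gaussian. This is only an illustration under stated assumptions. The actual computeNullParameters.R script and WIsH's internal p-value computation may fit and evaluate the null differently, and the model name and scores below are invented.

```python
import math
import statistics


def fit_null(scores):
    """Mean and sample standard deviation of one model's null scores."""
    return statistics.mean(scores), statistics.stdev(scores)


def upper_tail_p(score, mean, sd):
    """P(X >= score) for X ~ Normal(mean, sd)."""
    z = (score - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))


# Hypothetical scores of one bacterial model against a diverse phage set
# in which true hosts of that model are assumed to be negligibly rare.
null_scores = {"bin_001": [-1.42, -1.40, -1.41, -1.43, -1.39]}
params = {model: fit_null(s) for model, s in null_scores.items()}

# A new phage contig scoring well above the null distribution
# receives a small p-value for this model.
mean, sd = params["bin_001"]
p = upper_tail_p(-1.35, mean, sd)
print(f"bin_001: mean={mean:.3f} sd={sd:.4f} p={p:.2e}")
```

This also shows why the phage set only needs to be diverse rather than curated: as long as true hosts are a negligible fraction, the few infecting phages barely shift the fitted mean and standard deviation.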

This method also applies in your case.

Hope that helps, Best, Clovis.
