Closed boyangzhao closed 3 years ago
Thank you for your interest in the software.
The warning is unrelated to the issue you have raised, but I agree that those tools should not be checked in this function, and I have just removed the check on main.
Can you please try our test case?
library(magrittr)
library(data.table)
library(antigen.garnish)
v <- c("SIINFEKL", "ILAKFLHWL", "GILGFVFTL")
v %>% foreignness_score(db = "mouse") %>% print
I believe the software is working correctly. Peptides that return scores of NA are dropped, and SIINFEKL, for instance, would not be expected to return a score in humans.
@boyangzhao
Thanks. If I run this test case within the docker (docker into bash), I can confirm it's working, with three columns (nmer, foreignness_score, and IEDB_anno) and the three peptides. However, if I run this locally (using the latest antigen.garnish and an installation of blast), or using the docker but called via a cwl-engine, the result I get is:
Generating FASTA to query.
Running blastp for homology to IEDB antigens.
Removing temporary fasta files.
nmer
1: SIINFEKL
2: ILAKFLHWL
3: GILGFVFTL
There are no warning or error messages, so I'm not sure whether something else is going on that prevents it from outputting the other columns. It looks like it's missing the "Summing IEDB local alignments..." step.
Sorry this is still causing trouble. @leeprichman and I were just chatting about this. This code path can exit here with a warning, here with a warning, or here with an error due to the column length being wrong. None of these seems to be occurring for you.
Is it possible a warning is being suppressed? On local, could you please run debug(foreignness_score) and then step through the function? Please paste the output here.
I am not sure what is going on under cwl-engine but let's start with the simpler case.
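For anyone who wants to check whether a warning is being swallowed without stepping through the debugger, here is a minimal sketch in base R. The helper name capture_warnings is hypothetical and is not part of antigen.garnish; it only collects warnings that the wrapped call actually raises, so it will not reveal anything if the function exits without signalling one.

```r
# Hypothetical helper (not part of antigen.garnish): evaluate an expression
# and collect any warnings it raises instead of letting them be lost.
capture_warnings <- function(expr) {
  warnings <- character()
  value <- withCallingHandlers(
    expr,
    warning = function(w) {
      # Record the warning text, then continue evaluating the expression.
      warnings <<- c(warnings, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  list(value = value, warnings = warnings)
}

# Small demonstration with a call that is known to warn.
res <- capture_warnings(as.numeric("not a number"))
print(res$warnings)
```

In this thread it would be called as capture_warnings(foreignness_score(v, db = "mouse")), and an empty res$warnings would suggest that no warning was raised at all.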
Thanks for the speedy response. I've tried the debug and figured out that I hadn't downloaded your http://get.rech.io/antigen.garnish-2.2.0.tar.gz to install the BLAST databases. I downloaded that and defined AG_DATA_DIR, and it's now working!
It was strange that the warning about the BLAST database not being found was never displayed.
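Since the missing-database warning never surfaced here, a small pre-flight check can make the failure explicit. This is a sketch only: check_ag_data_dir is a hypothetical helper, not a function from antigen.garnish, and it only verifies that AG_DATA_DIR is set and points to an existing directory, not that the BLAST databases inside it are complete.

```r
# Hypothetical helper (not part of antigen.garnish): stop with a clear
# message if the data directory is not configured, rather than relying on
# a warning from the scoring function.
check_ag_data_dir <- function(path = Sys.getenv("AG_DATA_DIR")) {
  if (!nzchar(path)) {
    stop("AG_DATA_DIR is not set")
  }
  if (!dir.exists(path)) {
    stop("AG_DATA_DIR does not point to an existing directory: ", path)
  }
  # Return the resolved path invisibly so the check can be used inline.
  invisible(normalizePath(path))
}
```

Calling check_ag_data_dir() before foreignness_score() would then fail early when the environment variable is missing or wrong.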
Yeah, that is odd. I'm not sure why. I am glad it is working. Please re-open if you run into other issues!
When you mentioned "Peptides that return scores of NA are dropped", in what instances would the result be NA for foreignness and dissimilarity? And how should we interpret this?
One last thing, FYI: regarding the errors related to cwl-engine, the containers there are run as non-root, while the Docker image made available was created as root, so it faces permission issues when accessing the /root/antigen.garnish folder. I found a workaround in the meantime. Thanks! And yes, the issue is considered resolved.
Hi Boyang! If you are directly passing a vector of peptides to foreignness_score or dissimilarity and a peptide is not returned, it is because that sequence did not have any suitable blast alignments. In both cases, this is equivalent to a score of 0.
When run as part of the whole prediction function, this would get merged back to the table so the peptide would not be lost and NAs converted to 0s. Does that make sense?
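The merge-back behaviour described above can be reproduced by hand when calling the scoring functions directly. The following is a minimal sketch in base R, not the package's actual implementation; the score values are made up for illustration, and the scores table stands in for whatever foreignness_score returns.

```r
# All peptides that were submitted for scoring.
peptides <- data.frame(
  nmer = c("SIINFEKL", "ILAKFLHWL", "GILGFVFTL"),
  stringsAsFactors = FALSE
)

# Stand-in for the scoring output: one peptide had no suitable alignment
# and was dropped. The numeric values here are invented for illustration.
scores <- data.frame(
  nmer = c("SIINFEKL", "GILGFVFTL"),
  foreignness_score = c(0.25, 0.75),
  stringsAsFactors = FALSE
)

# Left join keeps every submitted peptide; missing scores appear as NA.
merged <- merge(peptides, scores, by = "nmer", all.x = TRUE)

# Treat "no alignment" as a score of 0, as described in the reply above.
merged$foreignness_score[is.na(merged$foreignness_score)] <- 0

print(merged)
```

The same pattern would apply to the dissimilarity column, with the caveat that 0 for an unaligned peptide is a convention rather than a validated value.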
Ok! Makes sense. Thanks!
Actually, sorry Lee, one last question regarding the NAs. For foreignness I get it, but for dissimilarity, if the peptide does not have any suitable blast alignments, wouldn't it be so dissimilar that it should have a score of 1? Otherwise, for example, an out-of-frame indel generating novel peptides that don't match any reference proteome would have a dissimilarity of 0. Poor alignments resulting in a higher dissimilarity, that I get.
No problem! Yes, that's a very astute observation. That said, this shouldn't happen in the setting of SNVs because the entire rest of the sequence should align. It's possible that a non-human source or frameshift could create a sequence with no alignments; however, given the size of the blast database and the permissiveness of the blast parameters, I have not yet seen this happen. It's also possible that a low-complexity sequence such as AAAAAAAAA could fall in this category, but such sequences would be unlikely to be immunogenic. Essentially, this is an unvalidated edge case and we erred on the side of not calling them dissimilar.
I'm running your docker and trying to just run foreignness and dissimilarity on a list of sequences, but I seem to be getting warnings (or errors). I'm expecting just a table as output, but that doesn't appear to be what I get.
Docker used: andrewrech/antigen.garnish:latest
After starting up the docker,
docker run -it andrewrech/antigen.garnish /bin/bash
and starting R, when I follow the example in the readme, the response I get is,
If I run
result <- foreignness_score(v, db = "human")
the result looks like, instead of a table of two columns (nmer and foreignness_score).
I don't wish to run with vcf and predict binding. If I just want to run the foreignness/dissimilarity scores, do I still need to install all the netMHC tools? Does it resolve the issue above?