Closed GoogleCodeExporter closed 9 years ago
There seems to be a problem with the parsing of your omx file. Would it be
possible for you to make the omx file available to us so that we can do some
testing? You can either upload it here or send it via e-mail if you don't want
the data to be online.
Original comment by harald.b...@gmail.com
on 28 Sep 2011 at 11:03
OK, the omx file is attached.
Original comment by lia.s...@gmail.com
on 28 Sep 2011 at 11:44
Attachments:
Apparently OMSSA has issues parsing your FASTA file. For protein accessions it
returns (see the omx file):
<MSPepHit_accession>sp|Q10MH8</MSPepHit_accession>
instead of:
<MSPepHit_accession>Q10MH8</MSPepHit_accession>
Peptide-Shaker thus tries to find the protein sp|Q10MH8 which does not exist :)
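A minimal sketch of the mismatch described above, written in Python for illustration rather than taken from the PeptideShaker code. The helper name `normalize_accession` and the list of database tags are assumptions; only the accession values come from the omx file.

```python
# Hypothetical helper, not part of PeptideShaker or OMSSA.
KNOWN_DB_TAGS = ("sp", "tr")  # assumed UniProt database tags


def normalize_accession(raw: str) -> str:
    """Return the bare accession from a value such as 'sp|Q10MH8'."""
    value = raw.strip()
    parts = value.split("|")
    # Only strip the first field when it is a known database tag.
    if len(parts) >= 2 and parts[0] in KNOWN_DB_TAGS:
        return parts[1]
    return value


print(normalize_accession("sp|Q10MH8"))
print(normalize_accession("Q10MH8"))
```

Both calls yield the same bare accession, which is the form the protein lookup expects.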
Which version of OMSSA are you using? How did you generate your FASTA file?
Could you send it to us?
Thank you for your help!
Original comment by mvau...@gmail.com
on 28 Sep 2011 at 12:59
OK. To generate my omx file, I use the MSDA (Mass Spectrometry Data Analysis)
web tool, which includes the OMSSA search engine.
Could I have your e-mail address? The file is too big.
Original comment by lia.s...@gmail.com
on 28 Sep 2011 at 1:27
I don't know the MSDA tool. Do you have a link?
To ensure compatibility with PeptideShaker we recommend using SearchGUI to
execute the searches. SearchGUI can be found here:
http://searchgui.googlecode.com
But we'll look into supporting MSDA if possible.
Original comment by harald.b...@gmail.com
on 28 Sep 2011 at 1:43
Here is the link to my FASTA file, downloaded from the MSDA web tool:
http://dl.free.fr/hp1uknDV9
Here is the link to the MSDA web tool:
https://msda.u-strasbg.fr/index.php
Thanks
Original comment by lia.s...@gmail.com
on 28 Sep 2011 at 1:45
[deleted comment]
I can only send you this link, from which you can download the FASTA file:
http://dl.free.fr/hp1uknDV9
Original comment by lia.s...@gmail.com
on 28 Sep 2011 at 1:52
Hi again,
I had no problem running a search with your database.
It breaks my Alsatian heart to say so but the incompatibility seems to come
from MSDA. Actually the problem is the parsing of the FASTA file for which we
use makeblastdb (version 2.2.24+) as advised by the OMSSA developers (Harald
correct me if I'm wrong). I will ask them how they do it and will try to ensure
compatibility.
On the other hand I advise you to use SearchGUI (searchgui.googlecode.com) for
OMSSA. It will ensure compatibility with Peptide-Shaker and search with
X!Tandem in parallel. Hence you have two search engines for the price of one :)
SearchGUI is straightforward to handle and you will be able to run searches
locally without having to register and provide personal information.
Also, Peptide-Shaker is designed for concatenated Target/Decoy search results.
So you might want to use a concatenated Target/Decoy database which you can
generate in SearchGUI from your fasta file. The decoy hits will be used for the
calculation of confidence and FDR for your peptides and proteins.
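As a rough illustration of how decoy hits feed into an FDR estimate, here is a minimal Python sketch. It assumes a concatenated target/decoy search, scores where higher is better, and the simple decoys/targets estimate; the function name and the example scores are made up, and PeptideShaker's actual statistics are more involved.

```python
from typing import List, Tuple


def fdr_at_threshold(psms: List[Tuple[float, bool]], threshold: float) -> float:
    """Estimate the FDR among hits scoring at or above the threshold.

    psms holds (score, is_decoy) pairs; higher scores are assumed better.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0
    # Each decoy hit stands in for one expected false target hit.
    return decoys / targets


# Made-up scores for illustration only.
example = [(9.0, False), (8.5, False), (7.0, True), (6.5, False), (5.0, True)]
print(fdr_at_threshold(example, 6.0))
```

Without a decoy section the decoy count is always zero, so an estimate of this kind carries no information.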
Finally, your database contains all kinds of species. The number of
identifications at a defined quality level will thus be dramatically reduced,
protein inference will be almost impossible and the search time will increase
considerably. You might want to tailor your database to the taxonomy you need
(human?). This can be done by downloading the fasta files of the needed species
from the Uniprot website (uniprot.org). In case you end up with several fasta
files you can merge them using dbtoolkit (dbtoolkit.googlecode.com).
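For readers who prefer scripting, the species tailoring can also be sketched in a few lines of Python. This is only an illustration, not what dbtoolkit or SearchGUI do internally; it assumes UniProt-style headers of the form `>sp|ACCESSION|NAME_SPECIES description`, and the function name and truncated placeholder sequences are made up.

```python
from typing import Iterable, Iterator


def filter_fasta(lines: Iterable[str], suffix: str = "_HUMAN") -> Iterator[str]:
    """Yield only the FASTA entries whose entry name ends with the suffix."""
    keep = False
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split(None, 1)
            fields = tokens[0].split("|") if tokens else []
            # Keep the entry only when the third field is e.g. ALBU_HUMAN.
            keep = len(fields) == 3 and fields[2].endswith(suffix)
        if keep:
            yield line


# Sequences are truncated placeholders, not full protein sequences.
demo = [
    ">sp|P02768|ALBU_HUMAN Serum albumin",
    "MKWVTF",
    ">sp|P07724|ALBU_MOUSE Serum albumin",
    "MKWVTF",
]
for kept in filter_fasta(demo):
    print(kept)
```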
The current version of Peptide-Shaker does not support such large databases
(see bug report 1). In the new version which will be released soon, large
datasets and databases are better handled. The results will still be full of
false positives though.
If you have more questions related to protein identification we encourage you
to contact us via the Peptide-Shaker mailing list
(groups.google.com/group/peptide-shaker).
Original comment by mvau...@gmail.com
on 28 Sep 2011 at 2:47
OK, I tried SearchGUI and I have a problem with the OMSSA installation.
The error message: "Failed to start Omssa, make sure that Omssa is installed
correctly and that you have selected the correct version of OMSSA for your
system"
Original comment by lia.s...@gmail.com
on 28 Sep 2011 at 2:54
If you are using Windows you should verify that the correct version is selected
(32 or 64 bits). In case it does not work (typically for versions older than
Windows 7) you might have to install missing libraries (see "OMSSA on Windows"
in the troubleshooting section of the SearchGUI webpage:
http://code.google.com/p/searchgui/#Troubleshooting).
Original comment by mvau...@gmail.com
on 28 Sep 2011 at 3:12
ok thanks I'll try
Original comment by lia.s...@gmail.com
on 28 Sep 2011 at 3:19
Dear all,
We are taking the liberty of joining the discussion, as we have seen the report
of the issues related to MSDA and would like to comment on it.
Indeed, the problem is related to the database formatting, as we are using
formatdb instead of makeblastdb. This is because OMSSA Browser and Scaffold show
problems when using makeblastdb.
So, to ensure that this will not be a limitation on using MSDA in the future, we
will soon offer both possibilities, so that people who want to visualize their
.omx files in PeptideShaker will be able to do so.
Also, as you are discussing database generation tools, our database generation
toolbox on MSDA includes all you would need to extract any taxonomies from
reference databases (NCBInr, UniProtKB, UniProtKB/Swiss-Prot), add known
sequences, contaminants and decoys, merge databases, generate FASTA files...
https://msda.u-strasbg.fr/
This was for the short advertising part:-)
Best regards and don't hesitate to contact us
The "broken heart" Alsatian team:-)
Original comment by alexandr...@gmail.com
on 4 Oct 2011 at 9:08
Thank you very much for your input. Actually we can solve this problem very
easily if we make our omssa results parser compatible with files generated
using formatdb. For this we only need to extract the accession number of the
protein of interest. How do you retrieve it usually?
In the omx file sent previously the accession line is:
<MSPepHit_accession>sp|P58047</MSPepHit_accession>
Do you know what it would look like for other kinds of databases?
Original comment by mvau...@gmail.com
on 4 Oct 2011 at 9:26
We usually use Scaffold to visualize our OMX files. Scaffold looks up the
content of the MSPepHit_accession elements in the FASTA file, using one regular
expression to split the accession number from the description (and another one
to identify the decoy entries).
The accession line you pointed out is in fact the result of an old script we
used to simplify the parsing of the accession numbers for our research team.
This script has been removed from MSDA to assure MSDA users that the OMX file
format is fully respected. The current output is now:
<MSPepHit_accession>P67779</MSPepHit_accession>
<MSPepHit_accession>REVERSED_109477550_XP_001070433</MSPepHit_accession>
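The two-regular-expression approach described above can be sketched as follows in Python. This is not Scaffold's or MSDA's actual configuration: the header pattern assumes the accession is the first whitespace-delimited token of the FASTA header, and the description text in the example is a placeholder. Only the REVERSED_ tag and the accessions come from the output shown above.

```python
import re

# Assumption: the accession is the first token after '>' in the FASTA header.
HEADER_RE = re.compile(r"^>(\S+)\s*(.*)$")
# Decoy tag as it appears in the current MSDA output.
DECOY_RE = re.compile(r"^REVERSED_")


def split_header(header: str):
    """Split a FASTA header into (accession, description)."""
    match = HEADER_RE.match(header)
    if match is None:
        raise ValueError(f"Not a FASTA header: {header!r}")
    return match.group(1), match.group(2)


def is_decoy(accession: str) -> bool:
    """Return True when the accession carries the decoy tag."""
    return DECOY_RE.match(accession) is not None


print(split_header(">P67779 placeholder description"))
print(is_decoy("REVERSED_109477550_XP_001070433"))
print(is_decoy("P67779"))
```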
Original comment by alexandr...@gmail.com
on 4 Oct 2011 at 1:46
Great, I will make sure that the next version of PeptideShaker (to be released
this month) handles this structure and decoy tag.
Original comment by mvau...@gmail.com
on 4 Oct 2011 at 2:12
I tried SearchGUI to generate my omx file, but when I load that omx file into
PeptideShaker it still does not work. I tried with the Mascot dat version of
the same search and it works without problems.
Original comment by lia.s...@gmail.com
on 12 Oct 2011 at 8:21
Send me the files you use as input to PeptideShaker (omx, mgf and fasta) and
I'll run them through the new version of PeptideShaker and see if I can figure
out the problem.
Original comment by harald.b...@gmail.com
on 12 Oct 2011 at 8:54
OK, everything is attached.
Here is the link to download the FASTA file: http://dl.free.fr/d1LogWLsc
Original comment by lia.s...@gmail.com
on 12 Oct 2011 at 9:33
Attachments:
Seems to work fine in the soon to be released new version of PeptideShaker.
I'll let you know as soon as the new version has been released so that you can
test it for yourself.
BTW, it's not recommended to use the whole of Swiss-Prot as the database. This
will result in matches to multiple organisms (human, mouse, bacteria etc, etc).
Something that doesn't make a lot of sense if you search with a human sample
for example...
Original comment by harald.b...@gmail.com
on 12 Oct 2011 at 10:55
I also noticed that you don't use a target-decoy database. This means that the
FDR-calculations (i.e., the protein validation) will be incorrect. This might
be the problem opening your files in the old PeptideShaker version. (Something
we fixed in the new version.)
You can easily add a decoy section to your database by clicking the "Decoy"
button in the "Parameters Editor" tab in SearchGUI. Note however that this will
take quite some time when used on your big FASTA file...
I'll test it and let you know.
Original comment by harald.b...@gmail.com
on 12 Oct 2011 at 11:02
Yes, I agree about the target-decoy database, but I am just trying out this
tool, which is why I use this file and this database. When I use the Mascot dat
file with the same database in PeptideShaker, it works.
Original comment by lia.s...@gmail.com
on 12 Oct 2011 at 11:07
Target-decoy doesn't help in the old version either. And takes a very long time
due to the increase in database size...
However, I am able to open your files both with and without the decoy section.
I do get an error (related to the validation plots, which will be empty), but
if I close that dialog I can interact with the data. Is this the case for you
as well?
Could you send me the PeptideShaker.log file in your conf folder?
Don't know why this is a problem for OMSSA files and not for Mascot files
though. But as I mentioned above this has been fixed in the new version of
PeptideShaker.
So I'll wait until that's available for you to test. Hopefully later this week,
or early next week.
Original comment by harald.b...@gmail.com
on 12 Oct 2011 at 11:52
OK, the PeptideShaker.log file is attached.
Original comment by lia.s...@gmail.com
on 12 Oct 2011 at 12:00
Attachments:
Lots of errors there... Try deleting the log file and re-run PeptideShaker on
the files causing you issues. Then send me the new log file.
It also seems to run out of memory. You could try to increase the max memory
settings for PeptideShaker as well. This is done in the file 'JavaOptions.txt'
in the conf folder. Increase the -Xmx1500M to -Xmx2500M if you have enough
memory for that.
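If you would rather script that change than edit the file by hand, the substitution could look roughly like the Python sketch below. The function name is made up, and the example options string (including the -Xms value) is an assumption about the file's content, not a copy of the real 'JavaOptions.txt'.

```python
import re


def raise_max_heap(options_text: str, new_mb: int) -> str:
    """Replace the first -Xmx<N>M token with -Xmx<new_mb>M."""
    return re.sub(r"-Xmx\d+[mM]", f"-Xmx{new_mb}M", options_text, count=1)


# Assumed example content; the real file may contain other options.
print(raise_max_heap("-Xms128M -Xmx1500M", 2500))
```

The value should stay below the physical memory available on the machine, otherwise Java may fail to start.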
All of this will be simpler in the new version which requires less memory.
Original comment by harald.b...@gmail.com
on 12 Oct 2011 at 12:13
Here is the file.
Original comment by lia.s...@gmail.com
on 12 Oct 2011 at 12:32
Attachments:
This seems to be the exact same log file? Did you delete it and re-run
PeptideShaker?
Original comment by harald.b...@gmail.com
on 12 Oct 2011 at 12:37
Sorry, here is the file.
Original comment by lia.s...@gmail.com
on 12 Oct 2011 at 12:53
Attachments:
There are no errors in this log file. So what actually happens when you try to
open your files?
Original comment by harald.b...@gmail.com
on 12 Oct 2011 at 12:58
When I try to open my file, the tool stops working and I have to close it.
Original comment by lia.s...@gmail.com
on 12 Oct 2011 at 1:00
You mean that it freezes and that you can no longer interact with it? If so
could you send me a screenshot of the tool when this happens?
This is also most likely related to memory issues. And you could try increasing
the memory settings as I explained above and see if that helps. Or did you
already try this?
Original comment by harald.b...@gmail.com
on 12 Oct 2011 at 1:03
Yes, it freezes and the program stops responding. I tried to increase the
setting to -Xmx2500M, but then I can't open the tool at all.
Original comment by lia.s...@gmail.com
on 12 Oct 2011 at 1:12
Attachments:
And the same thing does not happen when using a Mascot dat file? Strange.
Anyway, let's just wait until the new PeptideShaker version is released.
The new version is also more memory efficient, so it shouldn't matter that you
cannot set the memory to 2.5 GB.
Original comment by harald.b...@gmail.com
on 12 Oct 2011 at 2:36
Yes, when I use Mascot dat files there is no problem.
Original comment by lia.s...@gmail.com
on 13 Oct 2011 at 12:19
PeptideShaker v0.10.0 has just been released. Please let us know if this solves
your issues or not.
Original comment by harald.b...@gmail.com
on 19 Oct 2011 at 3:28
Hi, I tried with my omx file, which comes from the MSDA tool.
Here is the result:
Fri Oct 21 15:01:54 CEST 2011 Importing sequences from
uniprot_sprot_2011_08.fasta.
Fri Oct 21 15:02:11 CEST 2011 FASTA file import completed.
Fri Oct 21 15:02:11 CEST 2011 Reading identification files.
Fri Oct 21 15:02:11 CEST 2011 Reading file:
OlLA110504_albu-B-mod_264_1-2.omx
Fri Oct 21 15:02:17 CEST 2011 Identification file(s) import completed.
557 identifications imported, 95 identifications retained.
Fri Oct 21 15:02:17 CEST 2011 Computing assumptions probabilities.
Fri Oct 21 15:02:17 CEST 2011 Adding assumptions probabilities.
Fri Oct 21 15:02:17 CEST 2011 Selecting best hit per spectrum.
Fri Oct 21 15:02:17 CEST 2011 Generating PSM map.
Fri Oct 21 15:02:17 CEST 2011 Computing PSM probabilities.
Fri Oct 21 15:02:17 CEST 2011 Computing peptide probabilities.
Fri Oct 21 15:02:17 CEST 2011 Scoring PTMs.
Fri Oct 21 15:02:17 CEST 2011 An error occurred while working on the
identification. See the log file for more details.
Fri Oct 21 15:02:17 CEST 2011 Trying to resolve protein inference issues.
Fri Oct 21 15:02:17 CEST 2011 An error occured while loading the
identification files:
Fri Oct 21 15:02:17 CEST 2011 null
Fri Oct 21 15:02:17 CEST 2011 Import canceled.
When I try with the xml file from SearchGUI (X!Tandem search), there is no
problem, but it is still not working with OMSSA and I don't understand why.
Surprisingly, with my Mascot dat file it now does not work at all.
Original comment by lia.s...@gmail.com
on 21 Oct 2011 at 1:16
OK, so X!Tandem works, that's good.
Probably just some minor detail for the other two search engines.
Just detected a Mascot issue that might help you as well. Mascot peptides
sometimes contain the non-standard amino acids B, Z and X. These were not
supported in PeptideShaker, but have now been added.
But Mascot used to work for you before, right? I did update the Mascot parser
library, maybe that's it.
Could you send me your new log file (from the new PeptideShaker version)?
And are the input files (omx/dat, mgf and fasta) the same as before?
Original comment by harald.b...@gmail.com
on 21 Oct 2011 at 2:28
Yes, I used the same omx/dat files. With the older PeptideShaker version, I
did not have any problem with my Mascot dat file.
My log file is attached.
Original comment by lia.s...@gmail.com
on 21 Oct 2011 at 2:57
Attachments:
I cannot find your dat file. Could you send that to me as well?
Original comment by harald.b...@gmail.com
on 21 Oct 2011 at 3:10
From your log file I see the following "Protein not found: IGHG3_HUMAN". This
means that the protein was not found in your database. Which makes perfect
sense given that "IGHG3_HUMAN" is not a protein accession number but rather a
protein name. The accession number for this protein is "P01860". So this means
that something went wrong in the database parsing. Does this happen for the omx
or the dat file?
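To make the distinction between entry name and accession concrete, here is a small Python sketch that splits a UniProt-style header. It assumes headers of the form `>sp|ACCESSION|ENTRY_NAME description`; the function name is made up and the example uses the serum albumin entry (accession P02768) rather than anything from the log.

```python
def parse_uniprot_header(header: str):
    """Split '>sp|P02768|ALBU_HUMAN ...' into (database, accession, entry_name)."""
    tokens = header.lstrip(">").split(None, 1)
    fields = tokens[0].split("|") if tokens else []
    if len(fields) != 3:
        raise ValueError(f"Unexpected header format: {header!r}")
    database, accession, entry_name = fields
    return database, accession, entry_name


database, accession, entry_name = parse_uniprot_header(
    ">sp|P02768|ALBU_HUMAN Serum albumin"
)
# The second field is the accession, the third is the entry name.
print(accession, entry_name)
```

A parsing rule that returns the third field instead of the second produces exactly the kind of "Protein not found" message seen in the log.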
I also see some problems when trying to estimate a peptide's theoretical mass.
This looks similar to the problem I mentioned above for the Mascot special
amino acids B, Z and X. I've released a new version of PeptideShaker that
supports B, Z and X. Maybe you could give that a try and see if you can now
open your Mascot file again?
Original comment by harald.b...@gmail.com
on 21 Oct 2011 at 3:44
I tried the new version and got no result; I have the same problem. I don't
understand why it was working with the Mascot .dat file before and is not
working now. It still works with the files that come from SearchGUI, but not
with my omx file.
My log file is attached.
Original comment by lia.s...@gmail.com
on 24 Oct 2011 at 7:57
Attachments:
The reason for the Mascot error has been detected: "Unknown amino acid: U!". In
the new version we try to re-calculate the theoretical peptide mass (something
we didn't do before), and the special amino acids in Mascot (B, Z, X and now U)
result in issues. We'll fix this and release a new version later today.
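The kind of failure described here can be reproduced with a minimal Python sketch of a theoretical mass calculation. It is not the PeptideShaker implementation: it sums standard monoisotopic residue masses plus one water and includes selenocysteine (U). The ambiguous codes B, Z and X have no single mass, and how PeptideShaker resolves them is not stated in this thread, so the sketch simply rejects them.

```python
WATER = 18.010565  # monoisotopic mass of H2O

# Monoisotopic residue masses in Da; U is selenocysteine.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
    "U": 150.95364,
}


def theoretical_mass(sequence: str) -> float:
    """Return the monoisotopic mass of an unmodified peptide."""
    total = WATER
    for amino_acid in sequence:
        if amino_acid not in RESIDUE_MASS:
            # Mirrors the style of the error reported in the log.
            raise ValueError(f"Unknown amino acid: {amino_acid}!")
        total += RESIDUE_MASS[amino_acid]
    return total


print(round(theoretical_mass("GU"), 4))
```

Dropping the "U" entry from the table makes the same call fail with an unknown amino acid error, which is the situation before the fix.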
So the omx file you're now using comes from MSDA and not from SearchGUI? Then
there is still something wrong with the way the FASTA file is parsed in MSDA as
"IGHG1_HUMAN" is not a protein accession number. But we'll look into it and
perhaps contact the MSDA developers.
Original comment by harald.b...@gmail.com
on 24 Oct 2011 at 8:15
ok thanks
Original comment by lia.s...@gmail.com
on 24 Oct 2011 at 8:45
PeptideShaker version 0.10.3 has just been released. It supports the
Selenocysteine amino acid (the U) causing issues the last time around.
Hopefully this means that Mascot should work again.
If not, please send me the dat file so that I can test it.
As for the omx file, the omx file you sent us earlier had other issues than
what you now report, so if you could send the omx file you are using as well
that would help a lot. (As it does not seem to be the exact same file as
before..?)
Original comment by harald.b...@gmail.com
on 24 Oct 2011 at 2:29
My Mascot .dat file is attached.
Original comment by lia.s...@gmail.com
on 24 Oct 2011 at 2:56
Attachments:
Does this mean that it is still not working with the Mascot file?
Original comment by harald.b...@gmail.com
on 24 Oct 2011 at 2:58
yes it is!!!
Original comment by lia.s...@gmail.com
on 24 Oct 2011 at 2:59
Ok, I can confirm that it is possible to open your dat file in the old version,
but I do get a 'protein not found exception': "Protein not found! Accession:
UBP29_HUMAN". So this means that the parsing of the FASTA files on your Mascot
server is not set up correctly.
Which is the same problem you get with the new version (with a different
protein though): "Protein not found: IGLL5_HUMAN." The only difference that I
can see is that the new version stops you from continuing (given that you have
unknown proteins) whereas the old version allowed you to continue anyway.
Please refer to http://www.matrixscience.com/help/seq_db_setup.html for
database setup in Mascot. With the correct parsing rules you should be able to
load your dat file into PeptideShaker without any issues.
When the parsing rules are set up correctly they should return only the protein
accession number in the 'Accessions' column.
If you need help finding the correct parsing rules let me know and I'll send
them to you. (Don't have them available right now.)
Original comment by harald.b...@gmail.com
on 24 Oct 2011 at 3:18
Regarding your omx file, have you remade this as suggested by the MSDA team?
(see: http://code.google.com/p/peptide-shaker/issues/detail?id=2#c15)
If not you will still get the problem that the accession numbers in your omx
file are like this 'sp|P02768' and not like this 'P02768' as they should be.
And therefore not compatible with PeptideShaker.
Original comment by harald.b...@gmail.com
on 24 Oct 2011 at 4:10
As the issues seem to be related to the parsing of accession numbers on either
the Mascot server or MSDA (used for the OMSSA search), and therefore do not
require further changes to the PeptideShaker code, I'm setting this issue as
Fixed.
Please see the Read Me (http://code.google.com/p/peptide-shaker/#Read_Me) or
the new Database Help (http://code.google.com/p/searchgui/wiki/DatabaseHelp)
for further details.
Original comment by harald.b...@gmail.com
on 22 Dec 2011 at 1:59
Original issue reported on code.google.com by
lia.s...@gmail.com
on 28 Sep 2011 at 9:57
Attachments: