vdemichev / DiaNN

DIA-NN - a universal automated software suite for DIA proteomics data analysis.

The output file generated is only 1KB in size #1152

Open oogene opened 2 months ago

oogene commented 2 months ago

Hi, I used a peptide database that I built myself to generate a spectral library. However, when I used the library for a search, the output file was only 1 KB and contained no information. I checked the log from library building, and it shows that the library contains 0 proteins and 0 genes. I am confused; please give me some advice. Thank you!

vdemichev commented 2 months ago

Hi,

Can you please illustrate with logs & screenshots what you do and what you observe and provide more details on any questions that arise on Linux?

Best, Vadim

oogene commented 2 months ago

Thank you for your reply. Here are some details of my problem. The log reports 0 proteins and 0 genes when building the spectral library, but a library is still generated. However, that library cannot be used in the subsequent search: although the search reports no errors or warnings, the output file is empty. By the way, I also named the protein IDs in the FASTA file myself; I wonder whether that has an impact.

[Attached images: WeChat image_20240902225919, WeChat image_20240902230532, Screenshot 2024-09-02 230001]

vdemichev commented 2 months ago

it contains 0 proteins and 0 genes

This means DIA-NN was not able to read protein or gene names correctly from the FASTA, which can happen if the FASTA is not in UniProt format. The protein sequence IDs, however, were read correctly, so it's perfectly fine to use DIA-NN like this; just switch protein inference to the isoforms mode.
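For anyone wanting to check their own FASTA before building a library, here is a minimal Python sketch that counts how many headers look like UniProt headers (`>sp|ACCESSION|ENTRY_NAME ... GN=GENE ...`). The regular expressions are an approximation of the public UniProt header convention, not DIA-NN's actual parser, and the function name and example records are made up for illustration.

```python
import re

# Approximation of the UniProt header layout: >sp|ACCESSION|ENTRY_NAME description ...
# (assumption: this is NOT DIA-NN's own parsing logic)
UNIPROT_HEADER = re.compile(r"^>(sp|tr)\|([^|\s]+)\|(\S+)")
GENE_FIELD = re.compile(r"\bGN=(\S+)")


def summarize_fasta_headers(lines):
    """Count FASTA headers, UniProt-like headers, and headers carrying a GN= field."""
    total = uniprot_like = with_gene = 0
    for line in lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        total += 1
        if UNIPROT_HEADER.match(line):
            uniprot_like += 1
            if GENE_FIELD.search(line):
                with_gene += 1
    return {"headers": total, "uniprot_like": uniprot_like, "with_gene": with_gene}


# Illustrative records: one UniProt-style header, one self-named ID.
example = [
    ">sp|P69905|HBA_HUMAN Hemoglobin subunit alpha OS=Homo sapiens OX=9606 GN=HBA1 PE=1 SV=2",
    "MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHF",
    ">my_custom_id_1",
    "MKTAYIAKQR",
]
print(summarize_fasta_headers(example))
```

If `uniprot_like` is 0 for a self-built FASTA, a log line reporting 0 proteins and 0 genes would be consistent with the explanation above. To scan a real file, pass an open file handle instead of the list.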

although there is no error or warning in the search, the output file is empty

No raw files are specified on the screenshot, which means the main output report will be empty. What the run with the settings on the screenshot did generate is the noncode_7-100_lib.predicted.speclib file, which you can now use to analyse the raw data.
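As a rough illustration of the second step, the sketch below assembles a command line that passes raw files together with the predicted library. The flag names `--f`, `--lib`, `--out` and `--threads` are taken from my reading of the DIA-NN README and should be verified against your installed version; the executable path, the raw file names and the report name are placeholders, and the helper function is hypothetical.

```python
def build_search_command(exe, raw_files, library, report, threads=4):
    """Assemble a DIA-NN style argument list; refuses to build one without raw files."""
    if not raw_files:
        # Without raw files the main report has nothing to report on.
        raise ValueError("no raw files given: the main report would be empty")
    cmd = [exe]
    for raw in raw_files:
        cmd += ["--f", raw]  # assumption: one --f per run, as described in the README
    cmd += ["--lib", library, "--out", report, "--threads", str(threads)]
    return cmd


# Placeholder paths; only the library name comes from this thread.
cmd = build_search_command(
    "diann.exe",
    ["run1.raw", "run2.raw"],
    "noncode_7-100_lib.predicted.speclib",
    "report.tsv",
)
print(" ".join(cmd))
```

The list can be handed to `subprocess.run(cmd, check=True)` once the paths point at real files.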

Best, Vadim

oogene commented 2 months ago

Hi Vadim, I tried switching protein inference to the isoforms mode to generate a library, but the log still showed 0 genes and 0 proteins. Should I try modifying the format of my FASTA file next, or can you give me some other advice?

vdemichev commented 2 months ago

That’s perfectly fine, your analysis will contain correct sequence id-level protein information.

oogene commented 2 months ago

Hi Vadim, I haven't solved my problem yet. First, I switched protein inference to the isoforms mode to generate a library. In this step I didn't input any raw files, just a FASTA file; the settings are shown in the first picture. I then got a library, whose log shows that it contains 0 genes and 0 proteins. A screenshot of the log is shown in the second picture. Then I used this library in the subsequent analysis. Although the process didn't report any errors or warnings, I got no output, as can be seen in the third picture.

[Three attached images]

vdemichev commented 2 months ago

In the second picture, the diann.exe process crashes because the system does not have enough RAM to support analysis with such a huge library. To reduce RAM usage, I would suggest (i) switching Speed & RAM usage to the Low RAM mode, (ii) setting the precursor charge range to 2-3 and the precursor mass range to the actual range of the experiment, and (iii) fixing the mass accuracies and the scan window. If there is still not enough RAM, use --ref. If that is still not enough (unlikely), try splitting the precursor mass range as described in the "How to reduce memory usage/speed up DIA-NN when analysing in library-free mode?" section here: https://github.com/vdemichev/DiaNN?tab=readme-ov-file#frequently-asked-questions.
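A minimal Python sketch of suggestions (ii) and (iii), plus the range splitting mentioned last, might look like this. The flag names are as I understand them from the DIA-NN README and should be checked against your version; every numeric value (m/z range, ppm tolerances, scan window, number of parts) is a placeholder, not a recommendation. The Low RAM mode and `--ref` are left out because their exact settings are not given in this thread.

```python
def ram_saving_args(min_mz, max_mz, mass_acc_ppm, ms1_acc_ppm, scan_window):
    """Arguments restricting charge/mass range and fixing accuracies and scan window."""
    return [
        "--min-pr-charge", "2", "--max-pr-charge", "3",            # charge range 2-3
        "--min-pr-mz", str(min_mz), "--max-pr-mz", str(max_mz),    # actual experiment range
        "--mass-acc", str(mass_acc_ppm),                           # fixed MS2 accuracy (ppm)
        "--mass-acc-ms1", str(ms1_acc_ppm),                        # fixed MS1 accuracy (ppm)
        "--window", str(scan_window),                              # fixed scan window
    ]


def split_mz_range(min_mz, max_mz, parts):
    """Split a precursor m/z range into equal, contiguous sub-ranges."""
    step = (max_mz - min_mz) / parts
    return [(min_mz + i * step, min_mz + (i + 1) * step) for i in range(parts)]


# Placeholder values only.
args = ram_saving_args(400, 1000, 15, 15, 8)
chunks = split_mz_range(400, 1000, 3)
print(args)
print(chunks)
```

Each sub-range from `split_mz_range` would go into its own run via `--min-pr-mz` and `--max-pr-mz`; the FAQ section linked above describes the procedure itself.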