custom FASTA database - GitHub issues

Dear Vadim,

The background is as follows: I am using a custom FASTA database that is similar to the UniProt format but is not annotated. Its entries contain only sequence IDs, with no protein names or genes. Here is an example:

lcl|ORF35_ENST00000650931.1:974:1171|unnamed protein product
MKDLNVKTQTIKTLEENLGNTIQDMGTGKYFMTKMPKAVATKAKIDKWNLIKLKSFCTAKKLSSE
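To make the header layout above concrete, here is a minimal Python sketch that splits such a header into its pipe-separated fields. The function name `parse_header` and the interpretation of the three fields (database tag, sequence ID, description) are assumptions made for illustration; this is not a description of how DIA-NN itself reads FASTA headers.

```python
def parse_header(header: str) -> dict:
    """Split a pipe-delimited FASTA header into db tag, ID and description.

    Assumed layout (taken from the example entry): db|id|description.
    A leading '>' is tolerated but not required.
    """
    header = header.lstrip(">").strip()
    parts = header.split("|", 2)
    if len(parts) < 2:
        # No pipes: treat the first whitespace-separated token as the ID.
        tokens = header.split()
        return {"db": "", "id": tokens[0] if tokens else "", "description": ""}
    description = parts[2].strip() if len(parts) > 2 else ""
    return {"db": parts[0], "id": parts[1], "description": description}


example = "lcl|ORF35_ENST00000650931.1:974:1171|unnamed protein product"
fields = parse_header(example)
print(fields["id"])
```

With this layout, the field between the first and second pipe is the only identifier an entry carries, which is why no protein name or gene can be derived from the header alone.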

1. Why does the "protein group" column in the result report not show the protein IDs from the custom database? It outputs UniProt IDs instead. I tried setting "protein inference" to "isoform" or "off", but the result did not change.
2. When I switched to the UniProt database and compared the results of the two runs, the Protein Group and Protein IDs columns and the quantitative results (PG.Quantity, MaxLFQ) were identical; only the number of reported protein names differed. What is the reason for this? I don't understand why the quantitative results are the same when the databases are different.
3. If I use the R package to replace or add the missing protein IDs, how do I find the correspondence between the IDs and the report?
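For the third question, one possible approach (sketched here in Python rather than R) is to match the peptide sequences in the report against the sequences in the custom FASTA file. The sketch assumes that the FASTA file uses the standard ">" header prefix, that the ID is the second pipe-separated field of the header, and that the report has a column of unmodified peptide sequences (assumed here to be `Stripped.Sequence`). The helper names `read_fasta`, `entry_id` and `map_peptides` are hypothetical.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def entry_id(header):
    """Return the second pipe-separated field, or the first token if no pipe."""
    parts = header.split("|")
    if len(parts) > 1:
        return parts[1]
    tokens = header.split()
    return tokens[0] if tokens else ""


def map_peptides(peptides, entries):
    """Map each peptide to the IDs of all entries whose sequence contains it."""
    return {
        pep: [entry_id(h) for h, seq in entries if pep in seq]
        for pep in peptides
    }


# Example entry from this post, with the standard '>' prefix assumed.
fasta_text = """>lcl|ORF35_ENST00000650931.1:974:1171|unnamed protein product
MKDLNVKTQTIKTLEENLGNTIQDMGTGKYFMTKMPKAVATKAKIDKWNLIKLKSFCTAKKLSSE
"""
entries = list(read_fasta(fasta_text.splitlines()))

# In practice the peptides would come from the report's peptide column.
mapping = map_peptides(["TLEENLGNTIQDMGTGK", "NOTPRESENT"], entries)
print(mapping)
```

This is plain substring matching: it ignores I/L ambiguity and cleavage rules, and a peptide shared by several entries maps to all of them, so the result is a starting point for building an ID table rather than a replacement for protein inference.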

Here is the command line from my log:

diann.exe --f E:\THC_peptide\MS-DIA\IPX0001444000\Discovery_M\A20180430sunyt_TPD_DIA_b4-12.raw --f E:\THC_peptide\MS-DIA\IPX0001444000\Discovery_M\A20180430sunyt_TPD_DIA_b4-14.raw --lib E:\THC_peptide\MS-DIA\DIA-nn\spectral library\TPD_SPNlibrary_60min_46files_filter20210517_oldos.tsv --threads 4 --verbose 1 --out E:\THC_peptide\MS-DIA\DIA-nn\0102\0102report.tsv --qvalue 0.01 --matrices --out-lib E:\THC_peptide\MS-DIA\DIA-nn\0102\0102.tsv --gen-spec-lib --fasta E:\Select_lncRNA_ORF.fa --met-excision --cut K,R --var-mods 1 --var-mod UniMod:35,15.994915,M --reanalyse --relaxed-prot-inf --smart-profiling --pg-level 0 --peak-center --no-ifs-removal

Thank you for your response, and I wish you a good day.

Kind regards,
Mira

vdemichev / DiaNN

custom FASTA database #894