Open videlc opened 2 years ago
Hi Vivian,
Yes, a MaxQuant msms.txt can be used as a library by DIA-NN, although this is experimental functionality, so I recommend exporting it to DIA-NN's .tsv format and checking that everything looks right. Alternatively, you can format the list of peptides detected by MaxQuant as a FASTA file and generate an in silico library from that. Or you can use FragPipe instead of MaxQuant and use its library directly in DIA-NN.
Best, Vadim
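The second suggestion above (formatting the peptides detected by MaxQuant as FASTA) could be sketched in base R roughly as follows. This is only an illustration, not Vadim's code: `peptides_to_fasta` is a hypothetical helper, the `pepN` entry names are arbitrary, and it assumes the peptide sequences come from the `Sequence` column of msms.txt (a toy data frame stands in for the real file here).

```r
# Hypothetical helper (not from the thread): write each unique peptide
# as its own FASTA entry, named pep1, pep2, ...
peptides_to_fasta <- function(peptides, file_out) {
  # drop missing or empty sequences, then deduplicate
  peptides <- unique(peptides[!is.na(peptides) & nzchar(peptides)])
  headers <- sprintf(">pep%d", seq_along(peptides))
  # interleave header and sequence lines: header1, seq1, header2, seq2, ...
  lines <- as.vector(rbind(headers, peptides))
  writeLines(lines, file_out)
  invisible(peptides)
}

# Toy stand-in for the "Sequence" column of a real msms.txt (assumed column name)
msms <- data.frame(Sequence = c("PEPTIDEK", "PEPTIDEK", "ANOTHERR", NA),
                   stringsAsFactors = FALSE)
out <- tempfile(fileext = ".fasta")
peptides_to_fasta(msms$Sequence, out)
```

The resulting file can then be given to DIA-NN as the FASTA input for in silico library generation; whether peptide-level entries behave well with your digestion settings is something to check on your own data.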
Posting an R workaround here, for anyone interested, that selects the leading accessions from msms.txt and writes a corresponding FASTA file.
Note that library-free mode is slower than this workaround but gives better results. Note also that this code may be suboptimal and inelegant.
library(tidyverse)
library(janitor)
library(seqinr)
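# read every msms.txt found under the working directory and concatenate them
# (the pattern is a regular expression, so the dot is escaped and the end anchored)
msmsfiles <- list.files(pattern = 'msms\\.txt$', recursive = TRUE)
allmsms <- tibble()
for (file in msmsfiles) {
  temp <- read_delim(file, delim = '\t') %>% clean_names()
  allmsms <- bind_rows(allmsms, temp)
}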
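# select the leading accession (first entry of each protein group),
# then deduplicate once the trailing accessions have been stripped
allprots <- allmsms %>% select(proteins)
allprots$proteins <- gsub(';.*', '', allprots$proteins)
allprots <- unique(allprots)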
# import the fasta as a table and keep the full header of each entry
fasta <- read.fasta('path/to/fasta.fasta',
                    seqtype = 'AA',
                    whole.header = TRUE,
                    as.string = TRUE) %>% as_tibble() %>% t()
headers <- row.names(fasta)
fasta <- as_tibble(fasta) %>% mutate(header = headers)
#getting uniprot accession number from header
fasta$acc<-gsub('sp\\|','',fasta$header)
fasta$acc<-gsub('tr\\|','',fasta$acc)
fasta$acc<-gsub('\\|.*','',fasta$acc)
fasta$acc<-gsub('CON__','',fasta$acc)
fasta$rown<-1:nrow(fasta)
# select the proteins that appear in the allmsms table (from msms.txt)
short_fasta <- fasta %>% filter(acc %in% allprots$proteins)
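# write a fasta containing only the msms.txt leading-accession proteins
# deduplicate whole entries first so each sequence stays paired with its header
short_fasta <- short_fasta %>% distinct(header, .keep_all = TRUE)
write.fasta(sequences = as.list(short_fasta$V1), names = as.list(short_fasta$header), file.out = 'test.fasta')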
Vivian
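For anyone who wants to check the accession matching on its own before running the full script, here is a small base R sketch of the two string operations it relies on. The helper names `leading_accession` and `header_accession` are hypothetical, and the accessions and headers are made-up toy values in the UniProt `sp|ACC|NAME` style; it assumes headers follow that layout.

```r
# Hypothetical helpers, not part of the script above.

# Keep only the first accession of a ';'-separated protein group
leading_accession <- function(proteins) {
  sub(";.*", "", proteins)
}

# Extract the accession from a UniProt-style header "sp|ACC|NAME ..." or "tr|ACC|NAME ..."
# Headers that do not match the pattern are returned unchanged.
header_accession <- function(header) {
  sub("^(?:sp|tr)\\|([^|]+)\\|.*$", "\\1", header, perl = TRUE)
}

# Toy values standing in for the msms.txt "Proteins" column and the FASTA headers
proteins <- c("P12345;Q67890", "Q9XYZ1", "P12345")
headers <- c("sp|P12345|ABC_HUMAN Protein A",
             "tr|Q9XYZ1|Q9XYZ1_HUMAN Protein B",
             "sp|O00000|ZZZ_HUMAN Protein C")

wanted <- unique(leading_accession(proteins))
kept <- headers[header_accession(headers) %in% wanted]
```

Here `kept` holds the headers whose accession is a leading accession in the toy protein list, which is the same selection the script performs on the real FASTA table.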
Hey Vadim,
Since this is my first message here, thanks for developing such amazing software. I'm wondering if there is any (convenient) way to use a MaxQuant output file (e.g. msms.txt) as a spectral library? I've read here and there that this can be done with Skyline, and maybe with diapysef, but neither is really straightforward and I couldn't find any tutorial.
I could write an R script to "convert" msms.txt into a working spectral library, but if there's a simpler way, I'd take it.
Thanks, Vivian