Open skrakau opened 3 years ago
`prediction_id` is not needed ...
Regarding the size of the peptide tables and memory, for my current datasets:
# proteins: 1,855,616
# non-unique peptides (9-mers) across proteins (multiple occurrences within one protein are not counted): 552,451,599
# unique peptides: 392,722,935
Peak memory for `generate_peptides`: peak_vmem=176,900,708
New model containing entities as an additional link between microbiomes and proteins. The linking entity models taxa, MAGs/bins, or assembly contigs.
New color coding:
- Orange -> provided or pre-computed entities
- Gray -> associations
- Purple -> pipeline output
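As a rough illustration of the entity model, the sketch below links microbiomes to proteins through entities using two association tables. All table names, IDs, and the helper `proteins_for_microbiome` are hypothetical; the actual column layout of the draft is not reproduced here.

```python
# Hypothetical association tables, stored as sets of ID pairs.
# (microbiome_id, entity_id): which entities belong to which microbiome.
microbiomes_entities = {(1, 10), (1, 11), (2, 11)}
# (entity_id, protein_id): which proteins belong to which entity,
# where an entity could be a taxon, a MAG/bin, or an assembly contig.
entities_proteins = {(10, 100), (10, 101), (11, 101), (11, 102)}


def proteins_for_microbiome(microbiome_id):
    """Resolve the proteins of a microbiome via its linked entities."""
    entity_ids = {e for m, e in microbiomes_entities if m == microbiome_id}
    return {p for e, p in entities_proteins if e in entity_ids}


print(sorted(proteins_for_microbiome(1)))
print(sorted(proteins_for_microbiome(2)))
```

With this layout an entity (and its proteins) can be shared between microbiomes without duplicating protein rows.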
`protein_orig_id` is missing
Current draft proposal