Open cguetot opened 4 years ago
Hi Carlos,
Yes, you need to keep the seq column, but you can leave anything in that cell (seq is not used for de novo). The reader searches for "seq" in the header, so deleting that column should raise an error.
Let me know if you have any problems when running it.
Best,
Rui
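For anyone preparing such a file by script, here is a minimal sketch of the advice above, not the project's own tooling. It only assumes the header order quoted in this thread; the helper name `write_feature_file` and the example values are hypothetical, and the real format of fields such as "scans" or "profile" should be checked against the repository's example feature files.

```python
import csv

# Header order as quoted in this thread.
FEATURE_HEADER = ["spec_group_id", "m/z", "z", "rt_mean",
                  "seq", "scans", "profile", "feature area"]


def write_feature_file(path, rows):
    """Write a feature CSV that keeps the 'seq' column but leaves it blank.

    `rows` is a list of dicts keyed by column name (hypothetical helper).
    """
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FEATURE_HEADER)
        writer.writeheader()
        for row in rows:
            # Missing columns are written as empty cells.
            record = {name: row.get(name, "") for name in FEATURE_HEADER}
            record["seq"] = ""  # column must exist; its value is not used for de novo
            writer.writerow(record)


# Placeholder values for illustration only, not real data.
example_rows = [
    {"spec_group_id": "1", "m/z": 500.25, "z": 2, "rt_mean": 12.3, "scans": "1"},
]
```

Calling `write_feature_file("features.csv", example_rows)` produces a file whose header still contains "seq", with that cell left empty in every row.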
On Thu, Sep 10, 2020 at 10:29 AM, Carlos Gueto-Tettay notifications@github.com wrote:
Hi,
If I want to make predictions for a new mgf file, do I have to leave an empty cell for the 'seq' column in its feature file?
I noted that the headers of the feature files are defined as follows: "spec_group_id","m/z","z","rt_mean","seq","scans","profile","feature area"
best,
Carlos
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/volpato30/DeepNovoV2/issues/2, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACB2A6P6VTK3AG3ZES6JPBTSFDPFXANCNFSM4RFFFMXA .
Hi Rui,
I modified my comment, so you did not get the chance to read my second question:
Do you have any setting recommendations (deepnovo_config.py) for data coming from a QExactive-HF, both for training and de novo search?
Carlos
Hi Carlos,
I don't think you need to change parameters for Q Exactive data. Just make sure your training data has properties (enzyme, instrument, fragmentation method) similar to those of the data you want to run de novo sequencing on.
Rui
Can I use the same knapsack file from DeepNovo, or are they different?
They are the same, but be careful about the PTM settings (the AAs included in vocab_reverse). One knapsack file corresponds to a specific set of PTMs and MZ_MAX. I believe the original DeepNovo knapsack was generated with C(Cam), M(oxidation), NQ(Deamidation) and an MZ_MAX of 3000.
How can I build a custom knapsack for DeepNovoV2 with, for example, an MZ_MAX of 4000?
Change MZ_MAX to 4000 in the config file, then run $>make denovo. When the program detects no knapsack.npy file in the current folder, it will start building a new one with the configuration in the deepnovo_config.py file.
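Since a rebuild is only triggered when no knapsack.npy is present, an existing file (for example one copied from DeepNovo) has to be moved out of the way first. A minimal sketch of that step, assuming the file sits in the folder where you run the program; the helper name `set_aside_knapsack` and the `.bak` suffix are hypothetical and not part of DeepNovoV2:

```python
from pathlib import Path


def set_aside_knapsack(folder=".", name="knapsack.npy"):
    """Rename an existing knapsack file so the next run builds a fresh one.

    Returns the backup path, or None if there was nothing to rename.
    """
    source = Path(folder) / name
    if not source.exists():
        return None
    # Pick a backup name that does not overwrite an earlier backup.
    backup = source.with_name(source.name + ".bak")
    index = 1
    while backup.exists():
        backup = source.with_name(f"{source.name}.bak{index}")
        index += 1
    source.rename(backup)
    return backup
```

After calling `set_aside_knapsack()`, change MZ_MAX in deepnovo_config.py and run make denovo as described above. Keeping the old file as a backup rather than deleting it makes it easy to go back to the previous PTM and MZ_MAX settings.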
That's perfect.
Thanks,
Carlos
Hi again,
Several questions:
1) How can I increase the training, validation and test sizes? Context: variables like train_stack_size, valid_stack_size and test_stack_size are no longer used in this code, unlike in the old TensorFlow version.
2) I also see a variable called batch_size with a lower value (32) than in the original code (128). How does it affect the training process?
3) If I increase "num_workers", will it speed up the calculations?
4) Is it possible to get the top n best candidates for each scan?
thanks in advance,
Carlos
Hi,
If I want to make predictions for a new mgf file, do I have to leave an empty cell for the 'seq' column in its feature file?
I noted that the headers of the feature files are defined as follows: "spec_group_id","m/z","z","rt_mean","seq","scans","profile","feature area"
On the other hand, do you have any setting recommendations (deepnovo_config.py) for data coming from a QExactive-HF?
best,
Carlos