Open sroet opened 1 year ago
Hi Sander,
You are right, this should be handled better by the program. For now, you could replace each unknown residue with a glycine so that the residue numbering of the final model is what you expect. Hopefully I will push out a fix soon.
Best, Kiarash.
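For anyone who wants to script the glycine workaround, here is a minimal sketch. It is not part of ModelAngelo; the function name `replace_unknown_with_glycine` is made up for illustration, and it assumes a plain FASTA layout where header lines start with `>` and every other line is sequence.

```python
def replace_unknown_with_glycine(fasta_text: str) -> str:
    """Return the FASTA text with every X/x in sequence lines replaced by G.

    Header lines (starting with '>') are copied unchanged, so an 'X' in a
    sequence name is not touched. The sequence length is preserved, which
    keeps the residue numbering of the built model aligned with the input.
    """
    out_lines = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            out_lines.append(line)  # header: leave as-is
        else:
            out_lines.append(line.replace("X", "G").replace("x", "G"))
    # Re-join with a trailing newline after every line
    return "".join(line + "\n" for line in out_lines)


if __name__ == "__main__":
    example = ">chain_A with X in the name\nXMKVLAAX\n"
    print(replace_unknown_with_glycine(example), end="")
```

Because this is a one-for-one substitution rather than a deletion, position 105 in the input is still position 105 in the output.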
What would be the best way of dealing with these unknown residues: just delete them, replace them with glycines, or something else? Also, it would probably be nice to catch this issue before the start of the C-alpha prediction.
I think this is an underrated issue. It doesn't make sense that the input files aren't checked before the prediction starts; sometimes weird formatting or an error in a sequence can take 20 minutes to show up.
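In the meantime, a check like this can be run by hand before launching a build. This is only a rough sketch of what such a pre-check could look like, not how ModelAngelo validates its input. The function name `find_fasta_problems` is hypothetical, and the allowed alphabet is assumed to be the 20 standard amino acids (a nucleotide fasta would need a different set).

```python
# Assumption: only the 20 standard amino-acid one-letter codes are accepted.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def find_fasta_problems(fasta_text: str) -> list:
    """Return a list of (line_number, column, message) tuples.

    Line numbers are 1-based. Columns are 1-based positions within the
    line after surrounding whitespace is stripped; column 0 is used for
    problems that concern the whole line.
    """
    problems = []
    seen_header = False
    for lineno, line in enumerate(fasta_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped:
            continue  # ignore blank lines
        if stripped.startswith(">"):
            seen_header = True
            continue
        if not seen_header:
            problems.append((lineno, 0, "sequence data before first '>' header"))
        for col, ch in enumerate(stripped, start=1):
            if ch.upper() not in VALID_RESIDUES:
                problems.append((lineno, col, f"unexpected character {ch!r}"))
    return problems


if __name__ == "__main__":
    for lineno, col, msg in find_fasta_problems(">chain_A\nXMKV LAA\n"):
        print(f"line {lineno}, column {col}: {msg}")
```

An empty result means nothing suspicious was found under these assumptions; anything else points at the exact line and column to fix before spending time on the C-alpha prediction.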
@ColdPopeye this should be fixed since v1.0.8, does it still give you an issue?
I have a slightly sillier problem, and I'm not sure where or when it comes up. I have sequencing results as a Word file, from which I copy and paste them to make a fasta file. Sometimes formatting gets copied along or I mis-paste something (I think because Windows and WSL behave strangely together). In any case, the error only shows up after the C-alpha prediction, which is a bit annoying.
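One way to take the copy-paste step out of the equation is to clean the pasted text with a small script before writing the fasta file. This is a sketch under assumptions, not a ModelAngelo feature: `clean_pasted_sequence` is a made-up name, and it assumes that everything that is not an ASCII letter (position numbers, spaces, non-breaking spaces, zero-width characters, Windows line endings) is formatting debris.

```python
import unicodedata


def clean_pasted_sequence(raw: str) -> str:
    """Keep only ASCII letters from pasted sequence text, upper-cased.

    NFKC normalization first folds look-alike characters (for example
    full-width letters) to their plain ASCII forms. Everything that is
    still not an ASCII letter afterwards is dropped.
    """
    normalized = unicodedata.normalize("NFKC", raw)
    letters = [ch for ch in normalized if ch.isascii() and ch.isalpha()]
    return "".join(letters).upper()


if __name__ == "__main__":
    pasted = "  1 mkvlaagh\u00a0tyrs\r\n 13 lp\u200bqe\r\n"
    print(">my_protein")
    print(clean_pasted_sequence(pasted))
```

Note that this drops characters silently, so it is worth comparing the cleaned length against the expected sequence length afterwards.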
Hey,
I have a fasta file where certain residues are unknown and therefore represented with X, such as (1 at the start and 1 at place 105):
When trying to build against these fasta sequences you get an internal assertion error:
Log file:
```
2023-06-15 at 17:54:09 | INFO | ModelAngelo with args: {'volume_path': '../sharpened.mrc', 'protein_fasta': '../fasta_files/proteins.fa', 'rna_fasta': '../fasta_files/rna.fa', 'dna_fasta': None, 'output_dir': '20230615_fasta', 'mask_path': None, 'device': '0', 'config_path': None, 'model_bundle_name': 'nucleotides', 'model_bundle_path': None, 'keep_intermediate_results': False, 'pipeline_control': False, 'func':
```
If I (just) remove the "X" from the fasta sequence it seems to at least build a model without issue (still have to check if it is reasonable for my complete complex).
What would be the best way of dealing with these unknown residues: just delete them, replace them with glycines, or something else? Also, it would probably be nice to catch this issue before the start of the C-alpha prediction.