madscatt / zazzie

development branch
GNU General Public License v3.0

PDBRX report: "edit the system segmentation" #116

Open cjeong73 opened 6 years ago

cjeong73 commented 6 years ago

A> mskinvnvenVSGVQGFLFHTDGKESYGYRAFINGVEIGIKDIETVQGFQQIIPSINISKSDVEAIRKAMKk
B> mskinvnveNVSGVQGFLFHTDGKESYGYRAFINGVEIGIKDIETVQGFQQIIPSINISKSDVEAIRKAMKk
C> mskinvnveNVSGVQGFLFHTDGKESYGYRAFINGVEIGIKDIETVQGFQQIIPSINISKSDVEAIRKAMKk
D> mskinvnveNVSGVQGFLFHTDGKESYGYRAFINGVEIGIKDIETVQGFQQIIPSINISKSDVEAIRKAMKk

Current sequences (lowercase indicates residues not in coordinates):
A: mskinvnvenVSGVQGFLFHTDGKESYGYRAFINGVEIGIKDIETVQGFQQIIPSINISKSDVEAIRKAMK
C: NVSGVQGFLFHTDGKESYGYRAFINGVEIGIKDIETVQGFQQIIPSINISKSDVEAIRKAMK
B: NVSGVQGFLFHTDGKESYGYRAFINGVEIGIKDIETVQGFQQIIPSINISKSDVEAIRKAMK
D: NVSGVQGFLFHTDGKESYGYRAFINGVEIGIKDIETVQGFQQIIPSINISKSDVEAIRKAMKk

A: VSGVQGFLFHTDGKESYGYRAFINGVEIGIKDIETVQGFQQIIPSINISKSDVEAIRKAMK
C: NVSGVQGFLFHTDGKESYGYRAFINGVEIGIKDIETVQGFQQIIPSINISKSDVEAIRKAMK
B: NVSGVQGFLFHTDGKESYGYRAFINGVEIGIKDIETVQGFQQIIPSINISKSDVEAIRKAMK
D: NVSGVQGFLFHTDGKESYGYRAFINGVEIGIKDIETVQGFQQIIPSINISKSDVEAIRKAMK

cjeong73 commented 6 years ago

In terminal_edit_options in preprocess.py, I replaced mol.segname_info.sequence_to_fasta with mol.chain_info.sequence_to_fasta to get the FASTA sequence.

        print("Current sequences (lowercase indicates residues not in coordinates): ")

        for segname in seq_segnames:
            # Original call, replaced by the chain_info version below:
            # seq = mol.segname_info.sequence_to_fasta(
            #     segname, missing_lower=True)
            seq = mol.chain_info.sequence_to_fasta(
                segname, missing_lower=True)
            print(segname + ':')
            print(seq)

With this change, the non-default mode printed the segmentation info correctly, as shown below. I still need to find where this error originates in the code.

Current sequences (lowercase indicates residues not in coordinates):
A: mskinvnvenVSGVQGFLFHTDGKESYGYRAFINGVEIGIKDIETVQGFQQIIPSINISKSDVEAIRKAMKk
C: mskinvnveNVSGVQGFLFHTDGKESYGYRAFINGVEIGIKDIETVQGFQQIIPSINISKSDVEAIRKAMKk
B: mskinvnveNVSGVQGFLFHTDGKESYGYRAFINGVEIGIKDIETVQGFQQIIPSINISKSDVEAIRKAMKk
D: mskinvnveNVSGVQGFLFHTDGKESYGYRAFINGVEIGIKDIETVQGFQQIIPSINISKSDVEAIRKAMKk
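
For readers unfamiliar with the lowercase convention used in these listings, the following is a minimal sketch of the idea only. It is not the SASSIE implementation of sequence_to_fasta; the function name `sequence_with_missing_lower` and the choice of resolved positions are hypothetical and exist purely for illustration.

```python
def sequence_with_missing_lower(full_sequence, resolved_positions):
    """Return the sequence with unresolved residues in lowercase.

    full_sequence: one-letter residue codes for the whole chain.
    resolved_positions: 0-based indices of residues that have coordinates.
    (Hypothetical helper, not part of SASSIE.)
    """
    resolved = set(resolved_positions)
    return "".join(
        res.upper() if i in resolved else res.lower()
        for i, res in enumerate(full_sequence)
    )


# First 14 residues of the sequence discussed in this issue.
full = "MSKINVNVENVSGV"

# Assumption for the example: the first nine residues have no coordinates.
marked = sequence_with_missing_lower(full, range(9, len(full)))
print(marked)
```

Under that assumption the first nine letters come out lowercase and the remainder uppercase, which mirrors the start of the B, C and D listings above.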

madscatt commented 6 years ago

While this gives the correct output, it only affects a "print" statement.

cjeong73 commented 6 years ago

Yes, it only affects the print statement. I then tested what happens if segname_info is replaced with chain_info throughout the pdbrx code on the dev branch at onsager. With this change, the output psf and pdb files follow the sequence info from pdbscan, at least for the dev branch code.

madscatt commented 6 years ago

That is a possible solution. Perhaps a better one (as segname/segment is the currency involved) is to update the segname_info object. I am very uneasy with finding solutions by replacing bits of code without understanding the consequences.
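
One way to see where the two information sources disagree before swapping one for the other is to compare their sequences label by label. The sketch below is standalone and does not call any SASSIE code: `diff_sequences` is a hypothetical helper, the dictionaries are filled by hand from the A and C listings earlier in this thread, and it assumes segment names and chain IDs use the same labels, as they do in this report.

```python
def diff_sequences(by_segname, by_chain):
    """Return {label: (segname_seq, chain_seq)} for labels whose sequences differ.

    A label missing from one source is reported with None on that side.
    (Hypothetical helper, not part of SASSIE.)
    """
    mismatches = {}
    for label in sorted(set(by_segname) | set(by_chain)):
        seg_seq = by_segname.get(label)
        chain_seq = by_chain.get(label)
        if seg_seq != chain_seq:
            mismatches[label] = (seg_seq, chain_seq)
    return mismatches


# Portion of the sequence shared by every listing in this issue.
CORE = "VSGVQGFLFHTDGKESYGYRAFINGVEIGIKDIETVQGFQQIIPSINISKSDVEAIRKAMK"

# Sequences as printed via segname_info (A and C from the first listing).
by_segname = {"A": "mskinvnven" + CORE, "C": "N" + CORE}

# Sequences as printed via chain_info (A and C from the later listing).
by_chain = {"A": "mskinvnven" + CORE + "k", "C": "mskinvnveN" + CORE + "k"}

mismatches = diff_sequences(by_segname, by_chain)
for label, (seg_seq, chain_seq) in mismatches.items():
    print(label, len(seg_seq), len(chain_seq))
```

A comparison like this only shows that the two objects hold different residue information; it does not say which one is right or why they diverged.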

madscatt commented 6 years ago

See issue report in #117

dww100 commented 6 years ago

@cjeong73 What is the status of this following the fixes for #117?