pierrepo / grodecoder

GroDecoder extracts and identifies the molecular components of a structure file (PDB or GRO) produced by a molecular dynamics simulation.
https://grodecoder.streamlit.app/
BSD 3-Clause "New" or "Revised" License

Add `is_nucleic_acids` function in grodecoder.py and `NUCLEIC_ACIDS` dictionary in mol_def.py #62

Closed: KarinDuong closed this issue 4 months ago

KarinDuong commented 4 months ago

The dictionary will have the following form. It is based on the DNA and RNA residue names in the rna.rtp and dna.rtp files of each force field (mostly AMBER and CHARMM; the others don't have these files) in https://github.com/gromacs/gromacs/blob/main/share/top/

NUCLEIC_ACIDS = {
    "DA": "A",
    "DT": "T",
    "DC": "C",
    "DG": "G",

    "DA5": "A",
    "DA3": "A",
    "DT5": "T",
    "DT3": "T",
    "DC5": "C",
    "DC3": "C",
    "DG5": "G",
    "DG3": "G",

    "RA": "A",
    "RU": "U",
    "RC": "C",
    "RG": "G",

    "RA5": "A",
    "RA3": "A",
    "RU5": "U",
    "RU3": "U",
    "RC5": "C",
    "RC3": "C",
    "RG5": "G",
    "RG3": "G",
}

KarinDuong commented 4 months ago

The is_nucleic_acids and extract_nucleic_acids_sequence functions will work the same way as is_lipid and extract_protein_sequence. In addition, extract_nucleic_acids_sequence will test whether the resname starts with 'D' or 'R' to set molecular_type to 'DNA' or 'RNA'.
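
As a rough illustration of this proposal, here is a minimal sketch. The signatures are assumptions: the actual is_lipid and extract_protein_sequence functions in grodecoder.py are not shown in this thread, so the functions below take a plain list of residue names, and the returned dictionary layout is hypothetical. Only a subset of the NUCLEIC_ACIDS dictionary is reproduced.

```python
# Subset of the NUCLEIC_ACIDS dictionary proposed in this issue
# (just enough entries for the demonstration below).
NUCLEIC_ACIDS = {
    "DA": "A", "DT": "T", "DC": "C", "DG": "G",
    "DA5": "A", "DT3": "T",
    "RA": "A", "RU": "U", "RC": "C", "RG": "G",
    "RA5": "A", "RU3": "U",
}


def is_nucleic_acids(resnames):
    """Return True if the list is non-empty and every residue name is known.

    Hypothetical signature: the real function may receive a different object.
    """
    return len(resnames) > 0 and all(name in NUCLEIC_ACIDS for name in resnames)


def extract_nucleic_acids_sequence(resnames):
    """Build the one-letter sequence and guess the type from the name prefix."""
    sequence = "".join(NUCLEIC_ACIDS[name] for name in resnames)
    prefixes = {name[0] for name in resnames}
    if prefixes == {"D"}:
        molecular_type = "DNA"
    elif prefixes == {"R"}:
        molecular_type = "RNA"
    else:
        molecular_type = "nucleic acid"  # mixed prefixes or empty list: no guess
    return {"sequence": sequence, "molecular_type": molecular_type}


dna = ["DA5", "DC", "DG", "DT3"]
rna = ["RA5", "RC", "RG", "RU3"]
print(is_nucleic_acids(dna), extract_nucleic_acids_sequence(dna))
print(is_nucleic_acids(rna), extract_nucleic_acids_sequence(rna))
```

The prefix test relies only on the naming convention of the dictionary keys, where DNA residue names start with 'D' and RNA residue names start with 'R'.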

KarinDuong commented 4 months ago

Maybe we could also rename the protein_sequence field in the molecular inventory to sequence (instead of creating a new nucleic_acids_sequence field)? It would then hold either a protein sequence or a nucleic acid sequence.
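
To make the idea concrete, here is a small sketch of an inventory record with a single generic sequence field. The InventoryEntry class and the number_of_molecules field are hypothetical and only serve the illustration; the real inventory structure in GroDecoder is not shown in this thread.

```python
from dataclasses import asdict, dataclass


@dataclass
class InventoryEntry:
    """Hypothetical inventory record with one generic `sequence` field."""
    molecular_type: str
    number_of_molecules: int  # illustrative field, not taken from the project
    sequence: str = ""        # protein or nucleic acid sequence, empty otherwise


inventory = [
    InventoryEntry("protein", 1, "MKV"),
    InventoryEntry("nucleic acid", 2, "ACGT"),
    InventoryEntry("lipid", 128),
]

# Entries that carry a sequence, whatever their molecular type.
with_sequence = [asdict(entry) for entry in inventory if entry.sequence]
print(with_sequence)
```

With one shared field, code that consumes the inventory only needs to check whether sequence is non-empty, and molecular_type tells which kind of sequence it is.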

pierrepo commented 4 months ago

> will test if the resname start with 'R' or 'D' to define the molecular_type to 'DNA' or 'RNA'.

No need to differentiate DNA and RNA.

pierrepo commented 4 months ago

> rna.rtp and dna.rtp files of each force field [mostly AMBER and CHARMM, the other don't have these files] in https://github.com/gromacs/gromacs/blob/main/share/top/

Please mention these files in a comment where the dictionaries are defined.
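
One possible way to do this is sketched below. The comment wording is only a suggestion, and building the dictionary with a comprehension is an alternative to the explicit literal proposed in this issue, not the project's actual code. The names DNA_BASES, RNA_BASES and TERMINI are introduced here for illustration.

```python
# Residue names follow the rna.rtp and dna.rtp files shipped with the GROMACS
# force fields (mostly AMBER and CHARMM; the others do not provide these files):
# https://github.com/gromacs/gromacs/blob/main/share/top/
DNA_BASES = "ATCG"
RNA_BASES = "AUCG"
TERMINI = ("", "5", "3")  # internal residue, 5' terminal, 3' terminal

# Maps each residue name (e.g. "DA5") to its one-letter base (e.g. "A").
NUCLEIC_ACIDS = {
    prefix + base + terminus: base
    for prefix, bases in (("D", DNA_BASES), ("R", RNA_BASES))
    for base in bases
    for terminus in TERMINI
}

print(len(NUCLEIC_ACIDS), NUCLEIC_ACIDS["DA5"], NUCLEIC_ACIDS["RU3"])
```

The comprehension yields the same 24 keys as the explicit dictionary; the explicit form may still be preferable if residue names from other force fields that do not follow this pattern have to be added later.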

pierrepo commented 4 months ago

Could we close this issue?

KarinDuong commented 4 months ago

yes