gtDMMB / RNAStructViz

Visualization, comparison, and analysis of RNA secondary structures via a cross-platform GUI
https://github.com/gtDMMB/RNAStructViz/wiki
GNU General Public License v3.0

Export entire folder data to more detailed XML output formats in future releases? #64

Open maxieds opened 4 years ago

maxieds commented 4 years ago

This would amount to adding an "Export to XML" icon next to the folder name in the left-hand side (LHS) MainWindow display for sequences (folders are named by unique folder ID). The output should include more detailed summaries of all loaded structures than the other output formats RNAStructViz currently supports.

For example, I recently added the following stub functions to RNAStructure.cpp/h, which can be used to generate the strings we will want to store in the output XML file:

/* Functions to generate listings of substructural and pairing properties: */
        inline bool IsAmbiguous() {
             return strchr(GetSequenceString(), 'X') != NULL;
        }

        inline bool IsCanonical(bool skipAmbiguousPairs = false) {
             for(int bidx = 0; bidx < GetLength(); bidx++) {
                  const char bp1 = GetBaseAt(bidx)->getBaseChar(), bp2 = GetBaseAt(GetBaseAt(bidx)->m_pair)->getBaseChar();
                  if(skipAmbiguousPairs && (bp1 == 'X' || bp2 == 'X')) {
                       continue;
                  }
                  bool cpairs = (bp1 == 'A' && bp2 == 'U') || (bp1 == 'U' && bp2 == 'A') ||
                                (bp1 == 'G' && bp2 == 'U') || (bp1 == 'U' && bp2 == 'G') ||
                                (bp1 == 'G' && bp2 == 'C') || (bp1 == 'C' && bp2 == 'G');
                  if(!cpairs) {
                       return false;
                  }
             }
             return true;
        }

        static inline const char *DEFAULT_STRING_LIST_DELIMITER = "\n";

        std::string GetHelicesList(std::string strDelim = RNAStructure::DEFAULT_STRING_LIST_DELIMITER);
        std::string GetWatsonCrickPairs(std::string strDelim = RNAStructure::DEFAULT_STRING_LIST_DELIMITER);
        std::string GetCanonicalPairs(std::string strDelim = RNAStructure::DEFAULT_STRING_LIST_DELIMITER);
        std::string GetNonCanonicalPairs(std::string strDelim = RNAStructure::DEFAULT_STRING_LIST_DELIMITER);
        std::string GetPseudoKnots(std::string strDelim = RNAStructure::DEFAULT_STRING_LIST_DELIMITER);
        std::string GetWobblePairs(std::string strDelim = RNAStructure::DEFAULT_STRING_LIST_DELIMITER);
        std::string GetIsolatedPairs(std::string strDelim = RNAStructure::DEFAULT_STRING_LIST_DELIMITER);
        std::string GetNonIsolatedPairs(std::string strDelim = RNAStructure::DEFAULT_STRING_LIST_DELIMITER);
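
As a rough illustration of what the pair-listing stubs could return, here is a minimal standalone sketch. It assumes the conventional definitions (A-U and G-C are Watson-Crick pairs, G-U is a wobble pair, everything else is non-canonical) and works on a plain sequence plus dot-bracket string rather than on the RNAStructure class. The names `PairKind`, `ClassifyPair`, and `ListPairs`, and the `i-j` label format, are hypothetical and not part of RNAStructViz.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical pair categories; these names are not from RNAStructViz.
enum class PairKind { WatsonCrick, Wobble, NonCanonical };

// Classify one base pair by its two nucleotide characters (RNA alphabet).
inline PairKind ClassifyPair(char b1, char b2) {
    auto isPair = [&](char x, char y) {
        return (b1 == x && b2 == y) || (b1 == y && b2 == x);
    };
    if (isPair('A', 'U') || isPair('G', 'C')) return PairKind::WatsonCrick;
    if (isPair('G', 'U')) return PairKind::Wobble;
    return PairKind::NonCanonical;
}

// Collect 1-based "i-j" labels for all pairs of the requested kind, joined
// by strDelim. Only '(' and ')' are interpreted, so pseudoknot brackets
// such as '[' and ']' are ignored in this sketch.
inline std::string ListPairs(const std::string &seq, const std::string &dotBracket,
                             PairKind kind, const std::string &strDelim = "\n") {
    std::vector<std::size_t> openStack;
    std::vector<std::pair<std::size_t, std::size_t>> pairs;
    for (std::size_t j = 0; j < dotBracket.size() && j < seq.size(); j++) {
        if (dotBracket[j] == '(') {
            openStack.push_back(j);
        } else if (dotBracket[j] == ')' && !openStack.empty()) {
            std::size_t i = openStack.back();
            openStack.pop_back();
            if (ClassifyPair(seq[i], seq[j]) == kind) pairs.emplace_back(i, j);
        }
    }
    std::sort(pairs.begin(), pairs.end());  // order by 5' position
    std::string out;
    for (std::size_t k = 0; k < pairs.size(); k++) {
        if (k > 0) out += strDelim;
        out += std::to_string(pairs[k].first + 1) + "-" +
               std::to_string(pairs[k].second + 1);
    }
    return out;
}
```

For the sequence `GGGAAAUCC` with structure `(((...)))`, calling `ListPairs` with `PairKind::WatsonCrick` and a `","` delimiter gives `1-9,2-8`, and `PairKind::Wobble` gives `3-7`. Keeping the delimiter as a parameter mirrors the `strDelim` argument on the stubs above, so the same lists can be joined differently for display and for XML.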

maxieds commented 4 years ago

The output XML file can and should also contain the DOTBracket-formatted pairing data, the base sequence (nucleotide characters), the sequence length, comments, information on file origin (e.g., where on disk the structure data was loaded from), and possibly PostScript data for full radial-layout-style diagrams, which are straightforward to generate with Vienna.
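
To make the proposed contents concrete, here is a minimal sketch of how one structure entry might be serialized. No schema has been decided, so the element and attribute names (`structure`, `source`, `comments`, `sequence`, `dotbracket`, `length`) and the `StructureRecord`, `XmlEscape`, and `ToXml` helpers are all hypothetical. PostScript data is left out of this sketch.

```cpp
#include <sstream>
#include <string>

// Hypothetical record holding the fields listed above; not an RNAStructViz type.
struct StructureRecord {
    std::string sourcePath;  // where on disk the structure was loaded from
    std::string comments;
    std::string sequence;    // nucleotide characters
    std::string dotBracket;  // pairing data in dot-bracket notation
};

// Escape the five characters that XML reserves in text and attribute values.
inline std::string XmlEscape(const std::string &in) {
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        switch (c) {
            case '&':  out += "&amp;";  break;
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '"':  out += "&quot;"; break;
            case '\'': out += "&apos;"; break;
            default:   out += c;        break;
        }
    }
    return out;
}

// Serialize one record as a single <structure> element.
inline std::string ToXml(const StructureRecord &rec) {
    std::ostringstream xml;
    xml << "<structure length=\"" << rec.sequence.size() << "\">\n"
        << "  <source>" << XmlEscape(rec.sourcePath) << "</source>\n"
        << "  <comments>" << XmlEscape(rec.comments) << "</comments>\n"
        << "  <sequence>" << XmlEscape(rec.sequence) << "</sequence>\n"
        << "  <dotbracket>" << XmlEscape(rec.dotBracket) << "</dotbracket>\n"
        << "</structure>\n";
    return xml.str();
}
```

Escaping matters even for the pairing data, because extended dot-bracket notation can use `<` and `>` for pseudoknotted pairs, and those characters would otherwise break the markup. A folder-level export could wrap one such element per loaded structure in a root element named after the folder ID.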

@ceheitsch Any other suggestions for output data we would want here?

maxieds commented 4 years ago

This is partially implemented as a placeholder in the current GUI: