Closed laurajackson closed 5 years ago
Hi @laurajackson ,
Right, you won't be able to use get_characters() for this, since transforming the NeXML structure (which can be nested to arbitrary depth) into a single table (e.g. just rows and columns) will always involve some loss of information. To avoid this, you must stick with a nested data structure such as the nexml object class provided by RNeXML.
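To make the loss concrete, here is a minimal base-R sketch. It does not use RNeXML at all: the nested list, its field names (id, label, meta, states), and the values are all hypothetical stand-ins for a nested character description.

```r
# Hypothetical nested structure: each character carries its own metadata.
nested <- list(
  characters = list(
    list(id = "c1", label = "body_length", meta = list(unit = "mm"),
         states = c(t1 = 1.2, t2 = 0.9)),
    list(id = "c2", label = "wing_span", meta = list(unit = "cm"),
         states = c(t1 = 3.4, t2 = 2.8))
  )
)

# Flatten to a table: one row per taxon, one column per character.
flat <- as.data.frame(sapply(nested$characters, function(ch) ch$states))
names(flat) <- sapply(nested$characters, function(ch) ch$label)

print(flat)

# The per-character ids and units have no place in the table.
print(names(attributes(flat)))
```

Only the labels and the values survive the flattening. The ids and units would have to be stored somewhere else to be recovered.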
It's an unfortunate reality that such nested structures (lists, S4 objects, XML) are harder to work with than tabular data / data.frames; that's the price of the richer and more flexible format. It is still entirely possible to do what you want, but it won't just involve removing a "column", since you have to work with non-tabular data.
I don't think it makes sense to provide a dedicated function to "drop some data". I don't really understand this use case -- what harm is it to carry around the extra data? NeXML was never meant to provide the smallest possible file sizes. Still, you can modify the child elements of nexml@characters however you see fit. Exactly how you drop the character will depend on which characters block it appears in, whether it is a continuous or discrete character, etc. You will also have to decide whether you intend to drop just the character itself or also other metadata that may exist about the character.
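As a rough illustration of dropping one character while leaving everything else in place, here is a sketch that edits the XML directly with the xml2 package. Note this is a different route from modifying the nexml S4 object described above. The toy document is heavily simplified and not schema-valid NeXML, and the helper name drop_character, the ids, and the labels are all made up for the example.

```r
library(xml2)

# Toy, simplified NeXML-like document (hypothetical ids and values).
nexml_text <- '<nexml xmlns="http://www.nexml.org/2009" version="0.9">
  <meta id="m1" property="dc:title" content="Example matrix"/>
  <characters id="cs1" otus="os1">
    <format>
      <char id="c1" label="body_length"/>
      <char id="c2" label="wing_span"/>
    </format>
    <matrix>
      <row id="r1" otu="t1">
        <cell char="c1" state="1.2"/>
        <cell char="c2" state="3.4"/>
      </row>
      <row id="r2" otu="t2">
        <cell char="c1" state="0.9"/>
        <cell char="c2" state="2.8"/>
      </row>
    </matrix>
  </characters>
</nexml>'

doc <- read_xml(nexml_text)

# Hypothetical helper: remove a character definition and every cell that
# refers to it. local-name() sidesteps namespace prefixes in the XPath.
drop_character <- function(doc, char_id) {
  xpath <- sprintf(
    "//*[(local-name()='char' and @id='%s') or (local-name()='cell' and @char='%s')]",
    char_id, char_id
  )
  xml_remove(xml_find_all(doc, xpath))  # xml2 modifies the document in place
  invisible(doc)
}

drop_character(doc, "c2")

remaining_chars <- xml_attr(xml_find_all(doc, "//*[local-name()='char']"), "id")
remaining_cells <- xml_attr(xml_find_all(doc, "//*[local-name()='cell']"), "char")
title <- xml_attr(xml_find_first(doc, "//*[local-name()='meta']"), "content")

print(remaining_chars)
print(title)

# write_xml(doc, "modified.xml")  # uncomment to save the edited file
```

The sketch only removes the char element and its cells. A real file may also define states for discrete characters or attach meta elements to the character, and those would need their own decision, as noted above.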
@laurajackson Did you figure out what you need here? As discussed above, retaining metadata shouldn't be a problem if you use R's S4 structure, but a character matrix is a fundamentally less metadata-rich data structure, and any function that coerces XML into that format will be lossy. Does that make sense or am I missing something here?
closing issue as stale.
I am currently using the RNeXML package to import the following data matrix in XML format, modify it to remove some of the character matrix data, then get the same XML file back, keeping all the original metadata from the input file. I am able to get the modified matrix as a .csv file, but am unable to convert this new matrix into an XML file that still contains the original metadata from the input file. I understand that when you convert this file to a data.frame, you lose all the associated metadata from the XML file. Is there a function available in the current package that allows me to keep this data?