tableau / document-api-python

Create and modify Tableau workbook and datasource files
https://tableau.github.io/document-api-python/
MIT License

Need the ability to read from and write to the <desc> element within .tds files #72

Open ugamarkj opened 7 years ago

ugamarkj commented 7 years ago

To manage metadata documentation for fields in Tableau data sources more efficiently, I need to read the information in the <desc> element for a column so that I can write it to a database. Conversely, I need to write information back out to this element after reading it from a database. I will also need to determine which table a field comes from. Below is an example of the data I want to be able to read and write.

  <column datatype='string' name='[STATUS]' role='dimension' type='nominal'>
    <desc>
      <formatted-text>
        <run bold='true' fontsize='10'>Case Mix Index VAL</run>
        <run fontcolor='#686868'>&#10;The MSDRG weight for inpatients with charges &gt; 0. Does not include inpatient rehab, normal&#10;newborns or MSDRG 999.</run>
        <run bold='true' fontcolor='#297a98'>&#10;&#10;HSP_ACCT_MULT_DRGS.DRG_WEIGHT (HAR 651)</run>
      </formatted-text>
    </desc>
  </column>
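For reference, the read half of this request can be sketched with the Python standard library alone. This is a minimal illustration, not part of document-api-python: the helper name `read_description` is hypothetical, and it assumes the description always sits under `desc/formatted-text/run` as in the example above.

```python
import xml.etree.ElementTree as ET

# The column element from the example above, embedded as a string.
COLUMN_XML = """
<column datatype='string' name='[STATUS]' role='dimension' type='nominal'>
  <desc>
    <formatted-text>
      <run bold='true' fontsize='10'>Case Mix Index VAL</run>
      <run fontcolor='#686868'>&#10;The MSDRG weight for inpatients with charges &gt; 0. Does not include inpatient rehab, normal&#10;newborns or MSDRG 999.</run>
      <run bold='true' fontcolor='#297a98'>&#10;&#10;HSP_ACCT_MULT_DRGS.DRG_WEIGHT (HAR 651)</run>
    </formatted-text>
  </desc>
</column>
"""


def read_description(column):
    """Hypothetical helper: return the plain text of a column's <desc>, or None."""
    runs = column.findall("./desc/formatted-text/run")
    if not runs:
        return None
    # Concatenate the runs; formatting attributes (bold, fontcolor) are dropped.
    return "".join(run.text or "" for run in runs)


column = ET.fromstring(COLUMN_XML)
plain = read_description(column)
```

The result is a single string with the formatting removed, which is the form that would most naturally go into a database column.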

t8y8 commented 7 years ago

A first pass could be to add something in a simple unformatted tag -- I worry about the RTF-ness of our descriptions and trying to build an XML serializer to describe the rich text

ugamarkj commented 7 years ago

I would be fine reading and writing everything in the tag as a block and handling the values during pre/post processing.
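A rough sketch of that block-level approach, using only `xml.etree.ElementTree`. The function names `get_desc_block` and `set_desc_block` are hypothetical and are not part of document-api-python; the sketch assumes the caller already has the parsed column element.

```python
import xml.etree.ElementTree as ET


def get_desc_block(column):
    """Hypothetical helper: serialize the column's <desc> element, or return None."""
    desc = column.find("desc")
    if desc is None:
        return None
    return ET.tostring(desc, encoding="unicode")


def set_desc_block(column, block):
    """Hypothetical helper: replace the column's <desc> with the XML in `block`."""
    new_desc = ET.fromstring(block)  # raises ET.ParseError on malformed XML
    if new_desc.tag != "desc":
        raise ValueError("expected a <desc> root element")
    old_desc = column.find("desc")
    if old_desc is not None:
        column.remove(old_desc)
    column.append(new_desc)


column = ET.fromstring(
    "<column name='[STATUS]'>"
    "<desc><formatted-text><run>Old text</run></formatted-text></desc>"
    "</column>"
)
stored = get_desc_block(column)  # e.g. save this string to a database
set_desc_block(column, stored.replace("Old text", "New text"))
```

Parsing the block before touching the column means a malformed string fails early and leaves the existing description in place.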

graysonarts commented 7 years ago

Reading should be straightforward to add since we are already parsing some of the information out of the column tags. Writing is a bit trickier since we don't currently support writing fields.

I'll see if I can get reading done this week so it's included in the August release.

t8y8 commented 7 years ago

Just a note that I'm doing a little research on what rtf-generating libs exist out there already.

https://github.com/grangier/pyrtf exists but seems like overkill

t8y8 commented 7 years ago

@RussTheAerialist I'm thinking of an approach like this, so we don't have to re-implement RTF or take a dependency on pyrtf/pyrtf-ng:

field.description.update_text('blah blah blah')

will create a simple RTF blob of

      <formatted-text>
        <run fontsize='10'>Blah blah blah</run>
      </formatted-text>

And then we can also have

field.description.from_rtf(<RTF OBJECT FROM PYRTF OR OTHER THINGY>)

Which will insert the raw xml blob.
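As an illustration of the first half of this proposal, the blob could be built as follows. This is only a sketch: `build_formatted_text` is a hypothetical stand-in for whatever `update_text` would do internally, and the default font size of 10 is taken from the example above.

```python
import xml.etree.ElementTree as ET


def build_formatted_text(text, fontsize="10"):
    """Hypothetical helper: wrap plain text in a single-run <formatted-text> element."""
    formatted = ET.Element("formatted-text")
    run = ET.SubElement(formatted, "run", {"fontsize": fontsize})
    run.text = text  # ElementTree escapes &, < and > when serializing
    return formatted


blob = ET.tostring(build_formatted_text("Blah blah blah"), encoding="unicode")
```

Because the text is assigned to `run.text` rather than concatenated into a string, characters such as `>` in a description are escaped on output and the file stays well-formed.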

ugamarkj commented 7 years ago

Could you also do a .update_xml() function so the end user could do their own formatting with the tags? Or are you concerned the user will introduce malformed XML corrupting the file that way?

graysonarts commented 7 years ago

I'm planning to add the ability to change the formatted text. There is some pending work that needs to get done before we start enabling more edit scenarios.

I prefer the idea of a single update method that accepts plain text, rtf-based xml, or an rtf object (one that hopefully has an as_string method or something similar). That way we can support several ways of updating without having to change the signature.

I have some other high-priority things, unrelated to the document-api, that I need to get done in September, so I'm not sure when I'll get to it.
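A single update method of that kind might look roughly like the sketch below. Everything here is an assumption for illustration: `update_description` is a hypothetical name, the `as_string` check follows the wording above rather than any real RTF library, and XML input is detected simply by a leading `<formatted-text`.

```python
import xml.etree.ElementTree as ET


def update_description(column, value):
    """Hypothetical: set a description from text, <formatted-text> XML, or an as_string() object."""
    if hasattr(value, "as_string"):
        value = value.as_string()
    if value.lstrip().startswith("<formatted-text"):
        # Parse first so malformed XML raises ET.ParseError before the column changes.
        formatted = ET.fromstring(value)
    else:
        formatted = ET.Element("formatted-text")
        run = ET.SubElement(formatted, "run", {"fontsize": "10"})
        run.text = value
    desc = column.find("desc")
    if desc is None:
        desc = ET.SubElement(column, "desc")
    for child in list(desc):
        desc.remove(child)
    desc.append(formatted)


column = ET.fromstring("<column name='[STATUS]' />")
update_description(column, "Plain text description")
plain_result = column.find("./desc/formatted-text/run").text
update_description(column, "<formatted-text><run bold='true'>Bold</run></formatted-text>")
xml_result = column.find("./desc/formatted-text/run").get("bold")
```

Parsing user-supplied XML before modifying the column is one way to address the malformed-XML concern raised earlier in the thread: a bad string is rejected and the existing description is left untouched.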

ismailsimsek commented 6 years ago

I'm interested in this feature too. My use case is:

  1. finding source schema.table.column per data-source field
  2. pulling field comment from database (this part happens outside of document api)
  3. updating field description.

Thanks a lot for all the effort; this API is already very useful in my work.