The code currently skips all non-photo fields, including textboxes that contain text, drop-down menus with an item selected, and checkboxes that are checked.
I tried to work out how to handle this within the code's current structure by adding something to the handle_field method, but I couldn't figure it out. So instead I wrote a pre-processing function that finds those types of fields and replaces each one with the text that was supposed to be there: the entered text for a textbox, the selected item for a drop-down list (or the default, if that's appropriate), and "Yes" or "No" for a checkbox. Then, when you run the result through the converter, those fields come out as plain text.
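This required the third-party regex module rather than the standard-library re, because the patterns rely on recursive subpattern calls such as (?1) to match nested braces, which re does not support.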
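To make concrete what the recursive group in those patterns is matching, here is a plain-Python sketch that finds one balanced {...} group with a depth counter. It is not what the function uses (that relies on regex recursion); match_braced_group and the sample string are names and data I made up for illustration, and like the patterns it ignores escaped braces.

```python
def match_braced_group(text, start):
    """Return the balanced {...} group beginning at text[start], or None."""
    if start >= len(text) or text[start] != "{":
        return None
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                # Closing brace of the group that opened at `start`
                return text[start:i + 1]
    return None  # ran off the end: braces are unbalanced


# Hypothetical RTF-like fragment with two levels of nesting
sample = r"{\fldrslt {\b bold {\i nested}} tail} trailing"
group = match_braced_group(sample, 0)
print(group)
```

A non-recursive pattern would stop at the first closing brace, which is why nested groups need either recursion or a counter like this.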
I'm not submitting this as a pull request because I'm not sure where you'd want to include this sort of pre-processing. But if you want to do so, here is the function:
import regex

def flattenrtffields(rawrtf):
    # Get all "fields", including nested ones
    fieldsearch = regex.compile(r"{\\field[^{]*?({(?>[^{}]+|(?1))*})({(?>[^{}]+|(?1))*})}")
    textboxes, drops, checks = [], [], []
    checkboxoptions = ["No", "Yes"]
    # Make lists of the kinds of fields to flatten
    for field in fieldsearch.finditer(rawrtf):
        if "FORMTEXT" in field[0]:
            textboxes.append(field[0])
        elif "FORMDROPDOWN" in field[0]:
            drops.append(field[0])
        elif "FORMCHECKBOX" in field[0]:
            checks.append(field[0])
    # Textboxes: replace the field with its \fldrslt group
    for textbox in textboxes:
        try:
            result = regex.search(r"fldrslt ({(?>[^{}]+|(?1))*})}", textbox)[1]
            if result:
                rawrtf = rawrtf.replace(textbox, result)
        except (TypeError, IndexError):
            # No match (search returned None): leave the field untouched
            pass
    # Drop-down lists: replace the field with the selected (or default) item
    for drop in drops:
        try:
            ddresult = regex.search(r"fftype2.*ffres([0-9]*)", drop)[1]
            if ddresult == "25":
                # 25 means no explicit result, so fall back to the default
                ddresult = regex.search(r"ffdefres([0-9]*)", drop)[1]
            ddlist = regex.findall(r"ffl ([^}]*)}", drop)
            rawrtf = rawrtf.replace(drop, "{\\rtlch " + ddlist[int(ddresult)] + "}")
        except (TypeError, IndexError, ValueError):
            pass
    # Checkboxes: replace the field with "Yes" or "No"
    for check in checks:
        try:
            result = regex.search(r"fftype1.*ffres([0-9]*)", check)[1]
            if result == "25":
                result = regex.search(r"ffdefres([0-9]*)", check)[1]
            rawrtf = rawrtf.replace(check, "{\\rtlch " + checkboxoptions[int(result)] + "}")
        except (TypeError, IndexError, ValueError):
            pass
    return rawrtf
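For anyone who wants to check the drop-down and checkbox logic in isolation, here is a small sketch of just the "selected index, or default if the result is 25" step. These simple patterns don't need recursion, so it uses the standard-library re. resolve_result_index and the sample fragment are hypothetical, trimmed down from what a real form field contains, so treat it as an illustration rather than a drop-in replacement.

```python
import re

# Value the function above treats as "no explicit result, use \ffdefres"
UNDEFINED_RESULT = "25"


def resolve_result_index(field_rtf):
    """Return the selected index for a checkbox or drop-down field, or None."""
    match = re.search(r"ffres([0-9]+)", field_rtf)
    if match is None:
        return None
    value = match.group(1)
    if value == UNDEFINED_RESULT:
        # Fall back to the field's default result
        default = re.search(r"ffdefres([0-9]+)", field_rtf)
        if default is None:
            return None
        value = default.group(1)
    return int(value)


# Hypothetical drop-down fragment, trimmed for illustration
sample_drop = (
    r"{\*\formfield{\fftype2\ffres25\ffdefres1"
    r"{\*\ffl Red}{\*\ffl Green}{\*\ffl Blue}}}"
)
options = re.findall(r"ffl ([^}]*)}", sample_drop)
index = resolve_result_index(sample_drop)
selected = options[index]
print(selected)
```

For a checkbox the same index would be looked up in ["No", "Yes"] instead of the list of \ffl items.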