Closed davidverweij closed 4 years ago
In my current use of the module, it would be particularly handy to have the output files be named using the data from the .csv. To illustrate, I am generating .docx that need to be sent to the recipients, whose name I fill in the template. In order to trace back which .docx should go to who, it would be convenient to allow parametric customisation of the output file names.
Yes, being able to generate `.docx` named by a specific option would be great. Of course, that option must exist in the provided `.csv`, so we will need to add validation. Therefore, the `-n` option must match row headers in the CSV. (Maybe add that as the description?)
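A minimal sketch of that validation, assuming the CSV is read with `csv.DictReader`. The helper name `validate_name_columns` and the sample data are hypothetical and only there for illustration; they are not part of the module.

```python
import csv
import io


def validate_name_columns(fieldnames, requested):
    """Return the requested columns that are missing from the CSV header."""
    available = set(fieldnames or [])
    return [column for column in requested if column not in available]


# Hypothetical sample data standing in for the user's .csv file.
sample = io.StringIO("FIRSTNAME,LASTNAME\nAda,Lovelace\n")
reader = csv.DictReader(sample)

# "CITY" is deliberately absent from the header to show the failure case.
missing = validate_name_columns(reader.fieldnames, ["FIRSTNAME", "CITY"])
if missing:
    print(f"column(s) not found in CSV header: {', '.join(missing)}")
```

Returning the missing columns, rather than exiting inside the helper, would leave it to the caller to decide how to report the error.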
I am thinking of some kind of overloading (although I understand this is not intended in Python), or adding parameters - and checking their validity after opening the .csv. Then, we would definitely need to allow multiple values to ensure some type of uniqueness (and check for this too). Perhaps list parameter or alike?
We would need to update the variable passed to `write`, which is currently `counter`. The multiple values must exist in the csv used, so we would need to validate from the `fieldnames`. The `single_document` dict can be used to get the value we're interested in using for the filename (e.g. name). Before we start to enumerate the `csvdict` we could have a list named `filenames` and append to it with each loop and then use it as a lookup to ensure uniqueness prior to assigning the filename?
Can you think of a more elegant solution than creating a temporary list for comparison?
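One possible take on the lookup idea, sketched with a set instead of a list so the membership check stays cheap. The helper `unique_filename` and the sample rows are assumptions made for this example, not existing code in the module.

```python
def unique_filename(base, seen):
    """Return base, or base_2, base_3, ... if base was already handed out."""
    candidate = base
    suffix = 2
    while candidate in seen:
        candidate = f"{base}_{suffix}"
        suffix += 1
    seen.add(candidate)
    return candidate


# Hypothetical rows, shaped like the dicts a csv.DictReader would yield.
rows = [{"name": "david"}, {"name": "salman"}, {"name": "david"}]

seen = set()
filenames = [unique_filename(row["name"], seen) for row in rows]
print(filenames)
```

Because every generated name is also added to `seen`, a value that literally reads `david_2` in the CSV would not collide with a generated suffix.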
An alternative could be to update the specified column (let's say it's `name`) prior to enumeration to better separate the logic, e.g. pass `csvdict` to some method which loops over the `name` column and updates the values in place (so if name `david` appears twice it will become `david` and `david_2`). Then we can replace `counter` with `single_document[USER_OPTION]` where USER_OPTION in this case is name.
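A rough sketch of that in-place variant, assuming the rows have first been read into a list of dicts (a `DictReader` itself cannot be updated in place). The function name `dedupe_column` is hypothetical.

```python
from collections import Counter


def dedupe_column(rows, column):
    """Rewrite rows[i][column] in place so repeated values get _2, _3, ..."""
    seen = Counter()
    for row in rows:
        value = row[column]
        seen[value] += 1
        if seen[value] > 1:
            row[column] = f"{value}_{seen[value]}"
    return rows


# Hypothetical rows standing in for the materialised csvdict.
rows = [{"name": "david"}, {"name": "salman"}, {"name": "david"}]
dedupe_column(rows, "name")

USER_OPTION = "name"
for single_document in rows:
    print(f"{single_document[USER_OPTION]}.docx")
```

This sketch does not guard against the column already containing a value such as `david_2`; that case would need an extra check.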
I was working on this function and something strange is happening. I successfully created the verification that the `-n` option must match row headers in the CSV, but after implementing that feature I can't traverse `csvdict`. I don't know why. Here is the function:
    def generate_names(listnm):
        # Repeated names get a numeric suffix: david, david_2, david_3, ...
        newname = []
        for i in range(len(listnm)):
            if listnm[i] not in listnm[:i]:
                newname.append(listnm[i])
            else:
                newname.append(listnm[i] + "_" + str(listnm[:i].count(listnm[i]) + 1))
        return newname
This is what I added in the convert function:

    if custom_name is not None and custom_name not in csv_headers:
        print("column name not found")
        exit()
    else:
        # Iterating here reads csvdict through to the end of the file.
        file_names = generate_names(list(row[custom_name] for row in csvdict))
After this block I can't traverse `csvdict`. The function returns a list of names, which we can access at the end using `docx.write(f"{file_names[counter]}.docx")`.
@salmannotkhan -- feel free to make a draft pull request and I can have a look this evening (GMT+1) to try and understand the issue.
I suspect, although I'm not certain, that the reason you cannot enumerate `csvdict` here is that you can only iterate over a DictReader once (see here). A DictReader reads lazily from the open file rather than holding the rows in memory. When you do `list(row[custom_name] for row in csvdict)` you read the file through to its end, so there are no rows left for a later loop, even though you are still inside the `with` statement.
To explore my hypothesis above, could you pass `csvfile` to your method and open it inside `generate_items` using a `with` statement? An alternative is to use `seek(0)` to reuse the same file ... but this feels like a hack.
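To make the single-pass behaviour concrete, here is a small sketch that uses an in-memory `io.StringIO` as a stand-in for the opened file; the contents are made up. It shows the empty second pass, the `seek(0)` workaround, and a third option of reading the rows into a list once.

```python
import csv
import io

# In-memory stand-in for the opened csvfile; the contents are made up.
csvfile = io.StringIO("name\ndavid\nsalman\n")
csvdict = csv.DictReader(csvfile)

first_pass = [row["name"] for row in csvdict]
second_pass = [row["name"] for row in csvdict]  # nothing left to read
print(first_pass, second_pass)

# Option 1: rewind the file and build a fresh reader, so the header row
# is parsed as a header again.
csvfile.seek(0)
rewound = [row["name"] for row in csv.DictReader(csvfile)]

# Option 2: read the rows into a list once and reuse the list.
csvfile.seek(0)
rows = list(csv.DictReader(csvfile))
names = [row["name"] for row in rows]
```

Reading into a list costs memory proportional to the file size, which is probably acceptable for the kind of mail-merge data discussed here, but that is an assumption about typical input.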
I had this issue before - and I concur - it iterates using a reader, which is why the code originally opened the .csv twice (a crude fix I admit).
Got it. I'll try to implement the `generate_names` function inside the output loop so we don't have to open the file twice.
I'd suggest instead abstracting the logic into a separate method, as above, since that will make it easier to test. It also keeps the convert method clean and simple 👍
Done with this. I used seek because I didn't find any other way.
Great work -- if you make a PR I can test it and do a code review for you 👍
yeah sure
E.g., the invocation proposed in the original request:
poetry run convert -t template.docx -c data.csv -n ["FIRSTNAME", "LASTNAME"]
With output files as:
Thoughts?