IBM / Project_CodeNet

This repository supports contributions of tools for the Project CodeNet dataset hosted in DAX.
Apache License 2.0

Add ability to append output to existing file. #14

Closed riels89 closed 3 years ago

riels89 commented 3 years ago

Use case: When processing a large number of files, the list of files may not fit in the argument list, so batching is required. However, each batch would then overwrite the current output file. An append option would fix this issue.

geert56 commented 3 years ago

No append option is necessary when writing to stdout. In general, though, when using -o<file> it could be convenient to keep appending tokens to one and the same file across repeated invocations of tokenize. I have modified the code accordingly (as you suggested). Please be patient until a new version is available.
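Until such a version is available, the stdout route mentioned above can be scripted. The following is a minimal sketch, not part of the repository: it assumes the tool accepts file paths as arguments and writes its tokens to stdout when no -o is given. The helper names `batched` and `run_in_batches` and the default batch size are hypothetical.

```python
import subprocess
from itertools import islice


def batched(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def run_in_batches(command, files, out_path, batch_size=1000):
    """Run `command` once per batch of files, collecting stdout in one file.

    `command` is the argument list of the tool (assumed to print its
    result to stdout); each batch of file paths is appended to it.
    """
    # Truncate once up front so a rerun starts from an empty file.
    open(out_path, "wb").close()
    for batch in batched(files, batch_size):
        # Append mode: each invocation adds to what earlier batches wrote.
        with open(out_path, "ab") as out:
            subprocess.run([*command, *batch], stdout=out, check=True)
```

Under these assumptions, a call such as `run_in_batches(["tokenize"], paths, "tokens.out")` would keep each invocation's argument list short while still producing a single output file. The same effect can be had in a shell by redirecting each batch with `>>`.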

I am closing this request now.