dilshod / xlsx2csv

Convert xlsx to csv; it is fast and works for huge xlsx files
MIT License

input: add "-" notation for reading from stdin #264

Closed ferdinandyb closed 11 months ago

ferdinandyb commented 11 months ago

Currently, the only way to read from stdin is a workaround: running xlsx2csv /dev/stdin < example.xlsx. Add support for the standard notation of using "-" where a file argument would be expected, to read from stdin. This allows writing xlsx2csv - < example.xlsx but, more importantly, in | xlsx2csv -.

This commit only adds support for python3 and will fall back to the previous behaviour for python2. For compatibility reasons it does not use type=argparse.FileType('r'), but parses "-" manually.

Fixes: #263
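
For readers following along, here is a minimal sketch of what parsing "-" manually could look like. It is not the code from this PR: parse_args and open_input are hypothetical names, and the buffering step assumes that the input is handed to zipfile, which needs a seekable file object (an .xlsx file is a zip archive, and a pipe is not seekable).

```python
import argparse
import io
import sys


def parse_args(argv):
    """Hypothetical stand-in CLI: the input is a plain string argument."""
    parser = argparse.ArgumentParser(description="stdin notation sketch")
    # A plain string rather than type=argparse.FileType('r'),
    # so that "-" is interpreted by our own code below.
    parser.add_argument("infile", help='input path, or "-" to read from stdin')
    return parser.parse_args(argv)


def open_input(infile, stdin=None):
    """Return a path unchanged, or a seekable in-memory buffer for "-"."""
    if infile == "-":
        # Assumption: the whole stream fits in memory. The binary stdin
        # buffer is read fully so the result can be seeked by zipfile.
        stream = stdin if stdin is not None else sys.stdin.buffer
        return io.BytesIO(stream.read())
    return infile
```

With something like this, open_input(parse_args(sys.argv[1:]).infile) would yield either the path given on the command line or an in-memory copy of whatever was piped in.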

ferdinandyb commented 11 months ago

Credit where credit's due: https://stackoverflow.com/a/76780830/2241241

Konfekt commented 7 months ago

Hello, was there a test before this went into production? How can it possibly work in view of

#  xlsx2csv.py (lines 1216-1230)
try:
    if os.path.isdir(options.infile):
        convert_recursive(options.infile, sheetid, outfile, kwargs)
    elif not os.path.exists(options.infile):
        raise InvalidXlsxFileException("Input file not found!")
    else:
        xlsx2csv = Xlsx2csv(options.infile, **kwargs)
        if options.sheetname:
            sheetid = xlsx2csv.getSheetIdByName(options.sheetname)
            if not sheetid:
                sys.exit("Sheet '%s' not found" % options.sheetname)
        xlsx2csv.convert(outfile, sheetid)
except XlsxException:
    _, e, _ = sys.exc_info()
    sys.exit(str(e) + "\n")

Where is this documented, say in the --help output?

Konfekt commented 7 months ago

That said, it would be a useful feature, as Excel files are still common in many companies and https://github.com/phiresky/ripgrep-all/ is missing an adapter that reads them from stdin.

ferdinandyb commented 7 months ago

On Wed Dec 06, 2023 at 17:43, Enno @.***> wrote:

Hello, was there a test before this went into production? How can it possibly work in view of

It works for me, I use it quite often:

https://github.com/ferdinandyb/dotfiles/blob/b63d089700a949199b4b0dc51f34ebbe301025db/.config/aerc/aerc.conf#L380

#  xlsx2csv.py (lines 1216-1230)
try:
    if os.path.isdir(options.infile):
        convert_recursive(options.infile, sheetid, outfile, kwargs)
    elif not os.path.exists(options.infile):
        raise InvalidXlsxFileException("Input file not found!")
    else:
        xlsx2csv = Xlsx2csv(options.infile, **kwargs)
        if options.sheetname:
            sheetid = xlsx2csv.getSheetIdByName(options.sheetname)
            if not sheetid:
                sys.exit("Sheet '%s' not found" % options.sheetname)
        xlsx2csv.convert(outfile, sheetid)
except XlsxException:
    _, e, _ = sys.exc_info()
    sys.exit(str(e) + "\n")

I'm not sure what you see in this snippet, but it has been some time since I wrote the patch so ...

Where is this documented, say in the --help output?

I may have forgotten about that :)

Konfekt commented 7 months ago

Thank you. When I run cat file.xlsx | xlsx2csv - with the latest xlsx2csv, it complains that the input file was not found. Indeed, given

#  xlsx2csv.py (line 1140)
options.infile = args[0]

it is simply the first non-option argument?

ferdinandyb commented 7 months ago

On Wed Dec 06, 2023 at 23:32, Enno @.***> wrote:

Thank you. When I run cat file.xlsx | xlsx2csv - with the latest xlsx2csv, it complains that the input file was not found. Indeed, given

#  .local/pipx/venvs/xlsx2csv/lib/python3.11/site-packages/xlsx2csv.py (line 1140)
options.infile = args[0]

it is simply the first non-option argument?

❯ xlsx2csv --version
0.8.1

I just tested with cat and it worked. Can you verify your version?

Konfekt commented 7 months ago

It is 0.8.1. To be precise, I am on the latest commit, as the release did not contain this pull request.

ferdinandyb commented 7 months ago

@Konfekt I just checked, this PR was broken at some point.

ferdinandyb commented 7 months ago

@Konfekt you can test #271; it would be especially nice if you have Windows at hand.

Konfekt commented 7 months ago

Yes, thank you, that addition

 and options.infile != "-"

in

#  xlsx2csv.py (line 1219)
elif not os.path.exists(options.infile) and options.infile != "-":

solves it. I am (regrettably?!) not on Windows, though.

Sorry that it didn't occur to me that the check for an existing file was added after your commit; there were few commits since, and the check seemed like something that must have been there for a while.
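
To illustrate the interaction described above, here is a small sketch of an input check that treats "-" as stdin before touching the filesystem. It is not the code from #271, which per this thread adds the condition to the existing elif; classify_input is a hypothetical helper, and FileNotFoundError stands in for the project's own exception.

```python
import os


def classify_input(infile):
    """Return "stdin", "dir" or "file"; raise if the path does not exist."""
    if infile == "-":
        # Checked first, so "-" never reaches os.path.exists(), which
        # would report it as missing unless a file named "-" exists.
        return "stdin"
    if os.path.isdir(infile):
        return "dir"
    if not os.path.exists(infile):
        raise FileNotFoundError("Input file not found!")
    return "file"
```

Checking for "-" up front keeps any later existence check from rejecting it, which is the kind of regression this thread ran into.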

Konfekt commented 7 months ago

With this issue settled, how about a new release containing this stdin feature at some point?

ferdinandyb commented 7 months ago

@Konfekt You'll need to ping the maintainer about that, I'm not sure he even saw this thread.

dilshod commented 7 months ago

I've just released a new version, 0.8.2; it is uploaded to PyPI.

Konfekt commented 7 months ago

Thank you @dilshod; I was a bit shy.