invoice-x / invoice2data

Extract structured data from PDF invoices
MIT License

Draft: Windows-specific prefix check (magick), add support for Windows in GitHub Actions #555

Open alexm96 opened 6 months ago

alexm96 commented 6 months ago

Noticed an issue on Windows: calling convert on its own isn't correct, because it conflicts with a native Windows command of the same name. Instead, checking whether magick exists should be enough; convert can then be prefixed with magick when running the full command. Also ran black on tesseract.py, which unfortunately reformatted some stuff :(
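For readers following along, the idea could be sketched roughly as below. This is only an illustration under assumptions, not the code from this PR: the helper name build_convert_command and its which/platform parameters are hypothetical, and the only real calls used are shutil.which and sys.platform from the standard library.

```python
import shutil
import sys


def build_convert_command(args, which=shutil.which, platform=sys.platform):
    """Return an ImageMagick command list for the given convert arguments.

    Hypothetical helper for illustration only; `which` and `platform`
    are injectable so the logic can be exercised without ImageMagick installed.
    """
    if which("magick"):
        # The `magick` entry point exists, so prefix convert with it.
        # This never resolves to Windows' built-in convert.exe.
        return ["magick", "convert", *args]
    if platform.startswith("win"):
        # On Windows a bare `convert` is the native disk utility,
        # not ImageMagick, so refuse to fall back to it.
        raise OSError("ImageMagick 'magick' executable not found on PATH")
    if which("convert"):
        # On other platforms a bare `convert` is assumed to be ImageMagick.
        return ["convert", *args]
    raise OSError("ImageMagick not found on PATH")
```

The resulting list could then be passed to subprocess.Popen or subprocess.run in place of a hard-coded "convert" string.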

bosd commented 6 months ago

Good catch. Thanks for this PR. Could you split the black formatting into a separate commit?

I just realized that the unit tests here on GitHub don't run on Windows. Can you add Windows to the GitHub workflow? 🙏

alexm96 commented 6 months ago

@bosd sure thing, I'll switch it to draft for now and play around with it :)

alexm96 commented 6 months ago

@bosd btw, I've modified main.yml accordingly. I have been playing around with it on my forked branch and some tests are failing, so I'll keep grinding away at it. I'll probably end up splitting it out, since there seem to be some issues with the Windows GitHub Actions jobs running in parallel, where they fail to acquire a lock. Anyway, I'll keep you posted in here once it's ready to review :)

bosd commented 6 months ago

Thanks for the update. Earlier, I made an attempt to let the actions run here. I'll hold off for now, as you can test it on your own repo.

bosd commented 6 months ago

Maybe this helps. In this workflow, tests run correctly on a matrix of Ubuntu releases plus one Windows runner: https://github.com/py-pdf/pypdf_table_extraction/blob/main/.github/workflows/tests.yml

alexm96 commented 6 months ago

Yeah, if it's fine to only test against one Python version I would do it like that. Otherwise I would run the Windows ones sequentially :)

bosd commented 3 months ago

Oopsie, it looks like I messed up this PR while trying to resolve merge conflicts.

bosd commented 3 months ago

Superseded by #565. @alexm96 Can you help fix the tests?

Yeah, if it's fine to only test against one Python version I would do it like that.

Sounds like a good idea. Windows tests take a lot of compute and time to complete when running against the full matrix.