Open Afollet opened 3 years ago
So, I made some changes to the XlsxReader to change the behavior as I needed ( I also made some changes so it's easier to use the code base in our project): https://github.com/PDXFinder/pdx-validator/blob/dev/src/main/java/org/pdxfinder/validator/ValidatorXlsxReader.java
I made two changes:
I added a basic test for this behavior: https://github.com/PDXFinder/pdx-validator/blob/dev/src/test/java/org/pdxfinder/ValidatorXlsxReaderTest.java
I would be willing to make a PR for these changes on this repo. I was not sure if the maintainers are interested in these changes, so let me know.
I think that it will be solved with #909 if we also do an improvement in type detection like commented on #751
With #909 the reader it's able to read number values in String columns And if #751 it's solved will be a more consistent type detection that doesn't depend on the first cell type
@Afollet If you have time to check and can test with the current master version I think that this issue must be solved
@lujop Can I pull down the snapshot of this version of master? I think that would be easiest for me to test it, but I'm was not able to pull it on https://oss.sonatype.org/
Could you advise me on how to use maven to test these changes?
@Afollet you can just clone the repo and do a mvn install
and it will build and install the tablesaw jar in your local maven repository.
Then use the dependency with the new version as normal and while in the same machine maven will take the dependency from your local repository.
@lujop Yes, I have tested your changes and it has resolved this issue.
Thanks for the great work @lujop . Dev's like you make the world go round.
Hello devs,
Thanks for making/maintaining this library. I am currently in a microservice that validates the data in excel spreadsheets.
I'm reading an excel workbook with multiple sheets in tablesaw-excel.
I have cell types in Excel of both numeric and text in a column. When the table is read in, the numeric cell types become blank cells. Therefore, when I try to pull up missing values I get the rows where the numeric type cells are at
Is there some way I can change the behavior of this using XlsxReadOptions? Or is there another method I can use to avoid this behavior?
Thanks!