apache / hop

Hop Orchestration Platform
https://hop.apache.org/
Apache License 2.0

[Bug]: Excel input transform #4555

Closed mocoxk closed 1 week ago

mocoxk commented 1 week ago

Apache Hop version?

2.10.0

Java version?

openjdk version "17.0.12" 2024-07-16 LTS

Operating system

Linux

What happened?

Error with Excel input. I have one pipeline that generates an Excel XLSX file and another pipeline that uses that file as input, but the file is not read correctly. I have the same pipeline in Pentaho, and if I use the file it generates as input in my Hop pipeline, it is read correctly. So I'm not sure whether the problem is in the Excel input or in the output transform that generates the XLSX file.

Excel data: [Screenshot 2024-11-11 102126] [Screenshot 2024-11-11 102201]

Excel read from transform, correct: [Screenshot 2024-11-11 102011]

Excel read from transform, with problem: [Screenshot 2024-11-11 101947]

In the last image, you can see that the data is being duplicated and the values in column A are being displayed in column E, which is not correct. This error only occurs with the XLSX file generated by Apache Hop; with the one generated by Pentaho, it works correctly.

Issue Priority

Priority: 3

Issue Component

Component: Transforms

hansva commented 1 week ago

These are not the same values; it's column A + 1. Could you provide a sample Excel file that produces this error, or sample pipelines that reproduce the issue? Note: use a zip file to attach the files to this issue, as GitHub blocks a lot of file types.

mocoxk commented 1 week ago

I have tried to replicate the error, but now it is working well. Perhaps my MongoDB source data had some special characters that caused those jumps in the Excel input.

hansva commented 1 week ago

Maybe you can filter your input to fetch the records that gave the error. If it is a data error, feel free to close this ticket.
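One way to do that kind of filtering outside Hop, as a minimal sketch only: the thread never confirmed which characters (if any) caused the problem, so this assumes the culprits are control or format characters such as tabs and line breaks, and that the MongoDB documents have been exported as a list of Python dicts. The function names and the sample data are hypothetical.

```python
import unicodedata


def suspicious_chars(value):
    """Return the control/format characters (Unicode category C*) in a string."""
    return [ch for ch in value if unicodedata.category(ch).startswith("C")]


def find_suspect_records(records):
    """Return (record index, field name, code points) for each string field
    that contains at least one control/format character."""
    hits = []
    for index, record in enumerate(records):
        for field, value in record.items():
            if isinstance(value, str):
                bad = suspicious_chars(value)
                if bad:
                    hits.append((index, field, [f"U+{ord(ch):04X}" for ch in bad]))
    return hits


# Hypothetical sample standing in for documents exported from MongoDB.
sample = [
    {"name": "plain", "qty": 1},
    {"name": "tab\there", "qty": 2},
    {"name": "line\nbreak", "qty": 3},
]

for index, field, code_points in find_suspect_records(sample):
    print(index, field, code_points)
```

Records flagged this way could then be fed through the Excel output and input pipelines on their own to see whether they reproduce the column shift.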

mocoxk commented 1 week ago

Thank you. For now I will close it, and I will check whether a similar error appears again.