Appendium / flatpack

CSV/Tab Delimited and Fixed Length Parser and Writer
http://flatpack.sf.net
Apache License 2.0
Make fixed length metadata available #55

Open dmichalski opened 4 years ago

dmichalski commented 4 years ago

Hi,

I have a use case where I read a file with fixed length columns and write some of the rows back into another file (with same format). I can't use getRawdata() as the memory footprint would be too high.

My problem is that whenever I access a column that is filled with blanks in the input file, it is treated as empty, so the output row becomes shorter than the original. E.g.:

Mapping:

<COLUMN name="ColA" length="4" />
<COLUMN name="ColB" length="4" />
<COLUMN name="ColC" length="4" />

Input: `AAAA    CCCC`

Output (getString("ColA") + getString("ColB") + getString("ColC")): AAAACCCC

If I knew the metadata of the column (as defined e.g. in the metaData attribute of the RowRecord), a possible workaround would be to use String getString(String var1, Supplier<String> var2);. Of course, an extension to getString that does the filling automatically would be even better, imho.
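The manual filling described above could be sketched roughly as follows. This is not flatpack API: the class `FixedWidthRowBuilder` and its methods `pad` and `buildLine` are hypothetical names made up for illustration, and the column widths are assumed to be duplicated by hand from the mapping, precisely because the parser's own metadata is not reachable.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, NOT part of flatpack: pads parsed values back to their column width.
class FixedWidthRowBuilder {

    /** Right-pads value with blanks up to width; null counts as empty, longer values are cut. */
    static String pad(String value, int width) {
        String v = (value == null) ? "" : value;
        if (v.length() >= width) {
            return v.substring(0, width);
        }
        StringBuilder sb = new StringBuilder(v);
        while (sb.length() < width) {
            sb.append(' ');
        }
        return sb.toString();
    }

    /** Rebuilds one fixed-length line; widths must be iterated in mapping order. */
    static String buildLine(Map<String, Integer> widths, Map<String, String> values) {
        StringBuilder line = new StringBuilder();
        for (Map.Entry<String, Integer> col : widths.entrySet()) {
            line.append(pad(values.get(col.getKey()), col.getValue()));
        }
        return line.toString();
    }

    public static void main(String[] args) {
        // Assumption: widths copied by hand from the mapping file.
        Map<String, Integer> widths = new LinkedHashMap<>();
        widths.put("ColA", 4);
        widths.put("ColB", 4);
        widths.put("ColC", 4);

        // Values as the parser hands them back: the blank ColB arrives as "".
        Map<String, String> values = new LinkedHashMap<>();
        values.put("ColA", "AAAA");
        values.put("ColB", "");
        values.put("ColC", "CCCC");

        System.out.println("[" + buildLine(widths, values) + "]"); // prints [AAAA    CCCC]
    }
}
```

The obvious drawback is that the widths live in two places (the mapping and the code), which is why access to the parsed metadata, or a padding variant of getString, would be preferable.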

So questions are:

  1. Am I missing something? Is there existing functionality that does what I want?

  2. Is there a reason why the metadata is not accessible?

Greetings,

Dennis

benoitx commented 4 years ago

Hi Dennis

I'd be happy to check this, but it would really help if you could provide a unit test that I could include in the build.

Many thanks

Benoit

dmichalski commented 4 years ago

Hi,

I am willing to help as far as I can, but I don't understand exactly what you need. A test that reflects the desired behaviour?

Greetings, Dennis

benoitx commented 4 years ago

Yes, so it will fail until I fix the problem.

Thanks

Benoit

dmichalski commented 4 years ago

Ok. As I said, it is not really a problem but rather a missing feature, and how to implement it is open for discussion. Anyway, I wrote tests for two ideas on how to do it.

Hope that helps...

FixedLengthStringRetrievalTest.java.txt