lizmat / App-Rak

21st century grep / find / ack / ag / rg on steroids
Artistic License 2.0

Add option to get CSV parsing as an array of hashes #27

Closed Zer0-Tolerance closed 1 year ago

Zer0-Tolerance commented 1 year ago

It would be nice to get the CSV parsed into an array of hashes, just like the JSON, using Text::CSV:

use Text::CSV;
my @d = csv(in => "/file.csv", headers => "auto", sep_char => ";");
@d[1]<field>
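
To make the request concrete, here is a minimal sketch of the "array of hashes" shape in plain Raku, without Text::CSV. It is only an illustration: the naive split on the separator ignores quoting and escaping, which Text::CSV handles, and the variable names and sample data are assumptions.

```raku
# Illustration only: naive parsing, no quoting or escaping support.
my $sep   = ';';
my @lines = 'a;b;c;d', '1;2;3;4', '4;3;2;1';

# The first line supplies the keys for every following row.
my @header = @lines.shift.split($sep);

# Pair each header name with the field in the same column.
my @rows = @lines.map: -> $line { %( @header Z=> $line.split($sep) ) };

say @rows[1]<a>;   # field "a" of the second data row
```

Each element of @rows is a hash keyed by the header names, which is the access pattern the snippet above asks for.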

lizmat commented 1 year ago

You probably want to look at --per-file=code ?

lizmat commented 1 year ago

Specifically (I think):

$ rak --per-file='use Text::CSV; csv(in => $*SOURCE,headers => "auto",sep_char=> ";")' '.[1]<field>'

Zer0-Tolerance commented 1 year ago

This example does not seem to work:

cat /tmp/test.csv
a;b;c;d
1;2;3;4
4;3;2;1
rak --per-file='use Text::CSV; csv(in => $*SOURCE,headers => "auto",sep_char=> ";")' '{.[1]<a>}' /tmp/test.csv
A worker in a parallel iteration (hyper or race) initiated here:
  in sub show-results at /Users/.rakubrew/versions/moar-2022.07/share/perl6/site/sources/32CF7E3D90C6D484F952DE7725034C262FBF56B5 (App::Rak) line 848
  in sub rak-results at /Users/.rakubrew/versions/moar-2022.07/share/perl6/site/sources/32CF7E3D90C6D484F952DE7725034C262FBF56B5 (App::Rak) line 707
  in sub action-per-file at /Users/.rakubrew/versions/moar-2022.07/share/perl6/site/sources/32CF7E3D90C6D484F952DE7725034C262FBF56B5 (App::Rak) line 2892
  in sub main at /Users/.rakubrew/versions/moar-2022.07/share/perl6/site/sources/32CF7E3D90C6D484F952DE7725034C262FBF56B5 (App::Rak) line 427
  in block <unit> at /Users/.rakubrew/versions/moar-2022.07/share/perl6/site/resources/38D6E99ED5AD76FAFB3578F70FA248D49E4876DF line 3
  in sub MAIN at /Users/.rakubrew/shims/rak line 3
  in block <unit> at /Users/.rakubrew/shims/rak line 1

Died at:
    No such method 'CALL-ME' for invocant of type 'Str'
      in sub show-results at /Users/.rakubrew/versions/moar-2022.07/share/perl6/site/sources/32CF7E3D90C6D484F952DE7725034C262FBF56B5 (App::Rak) line 848
      in sub rak-results at /Users//.rakubrew/versions/moar-2022.07/share/perl6/site/sources/32CF7E3D90C6D484F952DE7725034C262FBF56B5 (App::Rak) line 707
      in sub action-per-file at /Users/.rakubrew/versions/moar-2022.07/share/perl6/site/sources/32CF7E3D90C6D484F952DE7725034C262FBF56B5 (App::Rak) line 2892
      in sub main at /Users/.rakubrew/versions/moar-2022.07/share/perl6/site/sources/32CF7E3D90C6D484F952DE7725034C262FBF56B5 (App::Rak) line 427
      in block <unit> at /Users/.rakubrew/versions/moar-2022.07/share/perl6/site/resources/38D6E99ED5AD76FAFB3578F70FA248D49E4876DF line 3
      in sub MAIN at /Users/.rakubrew/shims/rak line 3
      in block <unit> at /Users/.rakubrew/shims/rak line 1
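
As a general note on the message itself: in Raku, "No such method 'CALL-ME' for invocant of type 'Str'" is what results from invoking a string as if it were code. The sketch below reproduces that in isolation. Whether this is exactly what happened inside rak here is an assumption, and the variable name is made up for illustration.

```raku
# Hypothetical stand-in: a string sitting where a Callable is expected.
my $not-code = '.[1]<a>';

# Invoking it like a function fails; try captures the exception in $!.
try $not-code();

say $!.message if $!;
```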

Zer0-Tolerance commented 1 year ago

As an illustration, I've pasted the patch I applied to the module to implement it and fix issue #26:

Index: lib/App/Rak.rakumod
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/lib/App/Rak.rakumod b/lib/App/Rak.rakumod
--- a/lib/App/Rak.rakumod   (revision 20b8371c090845436e0a1041cae11dc5c1d57b25)
+++ b/lib/App/Rak.rakumod   (date 1668188972546)
@@ -207,7 +207,7 @@
 my %listing;     # listing options specified
 my %csv;         # arguments needed for --csv-per-line
 my %modify;      # arguments needed for --modify-files
-
+my $sep=','; # arguments needed for --sep
 my $needle;  # Callable needle for rak
 my %rak;     # arguments to be sent to rak()
 my $rak;     # the result of calling rak()
@@ -1938,7 +1938,9 @@
 }

 my sub option-sep($value --> Nil) {
-    set-csv-flag('sep', $value);
+    Bool.ACCEPTS($value)
+        ?? meh "'--sep' expects a separator character"
+        !! ($sep := $value);
 }

 my sub option-shell($value --> Nil) {
@@ -2432,8 +2434,8 @@
     %csv<auto-diag> := True unless %csv<auto-diag>:exists;
     my $csv := $TextCSV.new(|%csv);
     %csv = ();
-    %rak<produce-many> := -> $io { $csv.getline-all($io.open) }
-
+#    %rak<produce-many> := -> $io { $csv.getline-all($io.open) }
+    %rak<produce-many> := -> $io { $csv.csv(:headers('auto') :file($io.path) :sep_char($sep)) }
     activate-output-options;
     run-rak;
     rak-results;

lizmat commented 1 year ago

Will get back to this later today or tomorrow

lizmat commented 1 year ago

in -> $io { $csv.csv(:headers('auto') :file($io.path) :sep_char($sep)) }

is the sep_char necessary? Isn't that already set with my $csv := $TextCSV.new(|%csv); ?

Zer0-Tolerance commented 1 year ago

You're right, it should not be necessary, but it wasn't working until now. Let me recheck whether it's still needed.

Zer0-Tolerance commented 1 year ago

--sep=';' doesn't seem to change the separator unless I create a $sep and pass it as :sep_char. Maybe you're using the wrong key name in the hash?

lizmat commented 1 year ago

According to the CSV doc at https://github.com/Tux/CSV/blob/master/doc/Text-CSV.md it may be called "sep", "sep_char", "sep-char" or "separator". So "sep" should work...
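
The idea of several option names selecting one setting can be sketched as below. This is a hypothetical helper written for illustration, not code from Text::CSV or App::Rak; only the four alias names come from the documentation linked above, and the comma fallback is an assumption.

```raku
# Hypothetical helper: pick the separator from whichever alias was given.
my @sep-aliases = <sep sep_char sep-char separator>;

sub separator-from(%args) {
    for @sep-aliases -> $name {
        return %args{$name} if %args{$name}:exists;
    }
    ','   # assumed fallback when no alias is present
}

say separator-from(%( sep        => ';' ));
say separator-from(%( 'sep-char' => '|' ));
say separator-from(%());
```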

lizmat commented 1 year ago

% raku -I. bin/rak --csv-per-line '{ dd $_ }' csv --sep=';'
Hash %csv = {:auto-diag, :sep(";")}
$["a,b,c,d"]
$["1,2,3,4"]
$["4,3,2,1"]
$["1,2,3,4"]

vs

% raku -I. bin/rak --csv-per-line '{ dd $_ }' csv --sep=','
Hash %csv = {:auto-diag, :sep(",")}
$["a", "b", "c", "d"]
$["1", "2", "3", "4"]
$["4", "3", "2", "1"]
$["1", "2", "3", "4"]

So it looks to me like --sep works?
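
The difference between the two runs above comes down to whether the separator actually occurs in the data. A plain-Raku sketch of the same effect, using split rather than Text::CSV (so quoting rules are ignored), with a sample line taken from the output above:

```raku
my $line = 'a,b,c,d';

# Wrong separator: nothing to split on, so the whole line is one field.
say $line.split(';').elems;   # 1

# Matching separator: one field per column.
say $line.split(',').elems;   # 4
```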

Zer0-Tolerance commented 1 year ago

I confirm this is working; the problem was caused by my patch forcing :sep_char. So there is no need to add the sep in this line of code: -> $io { $csv.csv(:headers('auto') :file($io.path)) }

lizmat commented 1 year ago

https://github.com/lizmat/App-Rak/commit/2e3d2b5927 and https://github.com/lizmat/App-Rak/commit/cf40ec9841 implement and document a new --headers option.

I've found some weird stuff in the interfacing with Text::CSV, specifically with "uc", "lc" and hash mapping. Pretty sure it's a bug in Text::CSV, will check with Tux in the morning.

lizmat commented 1 year ago

This is now released as 0.2.7, with hashes being the default.

Thanks for the nudges and the heads-ups!

Zer0-Tolerance commented 1 year ago

That's wonderful!