pjotrp / bioruby-table

Swiss knife of tabular data
http://biogems.info/
MIT License

illegal seek when trying to do statistics on stdin #21

Closed wwood closed 10 years ago

wwood commented 11 years ago

Hey I'm liking this new stats thing - solved my problem when I wasn't even looking for it.

$ cat >/tmp/ta
#"bid"  "cid"   "length"
1   a   4658
1   b   12060
2   c   5858
2   d   5626
3   e   18451
$ bio-table --in-format tab --statistics /tmp/ta
bio-table 0.8.0 Copyright (C) 2012 Pjotr Prins <pjotr.prins@thebird.nl>

 INFO bio-table: Array: [{:show_help=>false, :write_header=>true, :skip=>0, :in_format=>:tab, :statistics=>true}, ["/tmp/ta"]]
 INFO bio-table: Array: ["#\"bid\"  \"cid\"   \"length\""]
stat    #"bid"  "cid"   "length"
size    5
min 1.0
max 3.0
median  2.0
mean    1.8
sd  0.8366600265340756
cv  0.4648111258522642

OK, so it doesn't give me the 3rd column (because the 2nd column was non-numeric?), but I guess I can live with that. But the real problem is this:

$ cat /tmp/ta |bio-table --in-format tab --statistics
bio-table 0.8.0 Copyright (C) 2012 Pjotr Prins <pjotr.prins@thebird.nl>

 INFO bio-table: Array: [{:show_help=>false, :write_header=>true, :skip=>0, :in_format=>:tab, :statistics=>true}, []]
 INFO bio-table: Array: ["#\"bid\"  \"cid\"   \"length\""]
#"bid"  "cid"   "length"
1   a   4658
1   b   12060
2   c   5858
2   d   5626
3   e   18451
stat    NA
size
min
max
median
mean
sd
cv
uqbwoodc@brown:20130129:~/RCII_prevalence/geochemistry_relationship$ cat /tmp/ta |bio-table --in-format tab --statistics --with-headers
bio-table 0.8.0 Copyright (C) 2012 Pjotr Prins <pjotr.prins@thebird.nl>

 INFO bio-table: Array: [{:show_help=>false, :write_header=>false, :skip=>0, :in_format=>:tab, :statistics=>true, :with_headers=>true}, []]
 INFO bio-table: Array: ["#\"bid\"  \"cid\"   \"length\""]
/srv/whitlam/home/users/uqbwoodc/.rvm/gems/ruby-1.9.3-p0/gems/bio-table-0.8.0/lib/bio-table/tableload.rb:30:in `rewind': Illegal seek - <STDIN> (Errno::ESPIPE)
    from /srv/whitlam/home/users/uqbwoodc/.rvm/gems/ruby-1.9.3-p0/gems/bio-table-0.8.0/lib/bio-table/tableload.rb:30:in `block (2 levels) in emit'
    from /srv/whitlam/home/users/uqbwoodc/.rvm/gems/ruby-1.9.3-p0/gems/bio-table-0.8.0/lib/bio-table/tableload.rb:19:in `each'
    from /srv/whitlam/home/users/uqbwoodc/.rvm/gems/ruby-1.9.3-p0/gems/bio-table-0.8.0/lib/bio-table/tableload.rb:19:in `each_with_index'
    from /srv/whitlam/home/users/uqbwoodc/.rvm/gems/ruby-1.9.3-p0/gems/bio-table-0.8.0/lib/bio-table/tableload.rb:19:in `block in emit'
    from /srv/whitlam/home/users/uqbwoodc/.rvm/gems/ruby-1.9.3-p0/gems/bio-table-0.8.0/bin/bio-table:236:in `each'
    from /srv/whitlam/home/users/uqbwoodc/.rvm/gems/ruby-1.9.3-p0/gems/bio-table-0.8.0/bin/bio-table:236:in `each'
    from /srv/whitlam/home/users/uqbwoodc/.rvm/gems/ruby-1.9.3-p0/gems/bio-table-0.8.0/bin/bio-table:236:in `<top (required)>'
    from /srv/whitlam/home/users/uqbwoodc/.rvm/gems/ruby-1.9.3-p0/bin/bio-table:19:in `load'
    from /srv/whitlam/home/users/uqbwoodc/.rvm/gems/ruby-1.9.3-p0/bin/bio-table:19:in `<main>'

Actually there are 2 problems there:

  1. Why do I specify with-headers on the stdin but not when operating directly on the file?
  2. The illegal seek error.
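For reference, the error class in the backtrace can be reproduced outside bio-table. The following is a minimal sketch, assuming a POSIX system; `rewindable?` is a hypothetical helper for illustration and is not part of bio-table. It shows that `IO#rewind` raises `Errno::ESPIPE` on a pipe (which is what STDIN is after `cat /tmp/ta |`), while the same call succeeds on a regular file.

```ruby
require 'tempfile'

# Hypothetical helper (not bio-table code): report whether a stream
# can be rewound, instead of letting Errno::ESPIPE propagate.
def rewindable?(io)
  io.rewind
  true
rescue Errno::ESPIPE
  false
end

# A pipe stands in for STDIN fed by `cat file | ...`.
reader, writer = IO.pipe
writer.puts "1\ta\t4658"
writer.close
pipe_ok = rewindable?(reader)
reader.close

# A regular file, as when the file name is passed on the command line.
file_ok = Tempfile.create('ta') do |f|
  f.puts "1\ta\t4658"
  rewindable?(f)
end

puts "pipe rewindable: #{pipe_ok}"
puts "file rewindable: #{file_ok}"
```

This would explain why the same command works when given `/tmp/ta` directly but fails when the data arrives on stdin.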

Thanks. These are not particularly big problems, and they are easily worked around by writing to an intermediate file before running bio-table. I'm using statsample 1.2.0, if that is relevant. ben

wwood commented 11 years ago

Wait, sorry, of course I need to specify `with-headers` on the stdin because I'm not keeping the headers. Sorry. The 2nd issue remains, though.

pjotrp commented 11 years ago

Interesting - how do you find this stuff :). I aim to do some work on bio-table and biogem soon.

You can probably fix this by not using STDIN for statistics.

wwood commented 11 years ago

> how do you find this stuff :)

I don't have a rain cloud as an avatar for nothing..

Cheers, you are right, you can just not use stdin for stats - that's what I ended up doing for my particular case.
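The intermediate-file workaround could also be done in-process. This is only a sketch under assumptions: `rewindable_copy` is a hypothetical helper, not a bio-table function, and nothing here reflects how bio-table itself handles its input. It copies a non-seekable stream into a temporary file so that later `rewind` calls succeed.

```ruby
require 'tempfile'

# Hypothetical helper (not part of bio-table): copy a possibly
# non-seekable stream into a temporary file and return that file,
# positioned at the start, so callers can rewind it freely.
def rewindable_copy(io)
  tmp = Tempfile.new('stdin-buffer')
  IO.copy_stream(io, tmp)
  tmp.rewind
  tmp
end

# A pipe stands in for piped STDIN.
reader, writer = IO.pipe
writer.write("#bid\tcid\tlength\n1\ta\t4658\n1\tb\t12060\n")
writer.close

input = rewindable_copy(reader)
reader.close

first_pass = input.readlines   # read everything once
input.rewind                   # no Errno::ESPIPE on a regular file
second_pass = input.readlines  # read it again from the start

input.close!                   # close and delete the temporary file
puts "lines per pass: #{first_pass.size}, #{second_pass.size}"
```

In a command-line tool the same helper could be called as `rewindable_copy($stdin)`. The trade-off is that the whole input is written to disk before processing starts.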

pjotrp commented 10 years ago

Statistics should work on STDIN now.