uwescience / myria-web

Web frontend for Myria
https://demo.myria.cs.washington.edu

problem parsing/submitting query #263

Closed sophieclayton closed 9 years ago

sophieclayton commented 9 years ago

I'm trying to parse/run a query and am getting this error:

<html><head>
<meta http-equiv="content-type" content="text/html;charset=utf-8">
<title>500 Server Error</title>
</head>
<body text=#000000 bgcolor=#ffffff>
<h1>Error: Server Error</h1>
<h2>The server encountered an error and could not complete your request.<p>Please try again in 30 seconds.</h2>
<h2></h2>
</body></html>
dhalperi commented 9 years ago

Query?

sophieclayton commented 9 years ago
min_max = scan(armbrustlab:seaflow:bead_stats_v4_byfile_untrans);
good_opp_vct = scan(armbrustlab:seaflow:good_opp_vct_v4);

min_maxb = select *,
    INT(File_Id) as int_file
    from min_max;
good_opp_vctb = select *,
    INT(File_Id) as int_file
    from good_opp_vct;

adjusted_particles = select d.fsc_small - m.fsc_avg as fsc_adj,
                            d.chl_small - m.chl_avg as chl_adj,
                            d.pe - m.pe_avg as pe_adj,
                            d.Cruise, d.Day, d.int_file as File_Id
                            from min_maxb as m, good_opp_vctb as d
                            where d.pop= "0" 
                            or d.pop="1"
                            or d.pop="2"
                            or d.pop="3"
                            or d.pop="4"
                            or d.pop="5"
                            or d.pop="6"
                            or d.pop="7"
                            or d.pop="8"
                            or d.pop="9"
                            or d.pop="10"
                            or d.pop="cocco"
                            or d.pop="pico2"
                            or d.pop="pico5"
                            or d.pop="pico7"
                            or d.pop="ultra"
                            or d.pop="crypto"
                            or d.pop="nano"
                            or d.pop="pico3"
                            or d.pop="pico"
                            or d.pop="diatoms"
                            or d.pop="lgdiatoms"
                            or d.pop="pico6"
                            or d.pop="pico1"
                            or d.pop="smdiatoms"
                            or d.pop="pico4"
                            or d.pop="nano2"
                            or d.pop="pico9"
                            or d.pop="pico8"
                            or d.pop="prochloro"
                            and d.Cruise = m.Cruise 
                            and d.Day = m.Day 
                            and d.int_file = m.int_file;

store(adjusted_particles, armbrustlab:seaflow:adj_byfile_phytoexp_untrans);
domoritz commented 9 years ago

Just as a side note: I believe "and" has precedence over "or", so your query is interpreted as

adjusted_particles = select d.fsc_small - m.fsc_avg as fsc_adj,
                            d.chl_small - m.chl_avg as chl_adj,
                            d.pe - m.pe_avg as pe_adj,
                            d.Cruise, d.Day, d.int_file as File_Id
                            from min_maxb as m, good_opp_vctb as d
                            where d.pop= "0" 
                            or d.pop="1"
                            ...
                            or d.pop="pico4"
                            or d.pop="nano2"
                            or d.pop="pico9"
                            or d.pop="pico8"
                            or (d.pop="prochloro"
                            and d.Cruise = m.Cruise
                            and d.Day = m.Day 
                            and d.int_file = m.int_file);

@dhalperi correct me if I'm wrong.
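
The precedence point can be checked with a small Python analogue. This is only a sketch: it assumes MyriaL follows the usual SQL rule, which Python's boolean operators share, that "and" binds tighter than "or". The function and variable names are hypothetical stand-ins for the population checks and the join conditions in the query.

```python
# Python analogue of the WHERE clause: `and` binds tighter than `or`.
# pop_a / pop_b stand in for two of the d.pop = "..." comparisons,
# join_ok stands in for the three join conditions combined.

def unparenthesized(pop_a, pop_b, join_ok):
    # Parsed as: pop_a or (pop_b and join_ok)
    return pop_a or pop_b and join_ok


def parenthesized(pop_a, pop_b, join_ok):
    # The intended meaning: (pop_a or pop_b) and join_ok
    return (pop_a or pop_b) and join_ok


# A pair of rows that matches the first population but fails the join:
leaks_through = unparenthesized(True, False, False)
filtered_out = parenthesized(True, False, False)
```

Without the parentheses, the mismatched pair is accepted, which in the query means that most population values produce a cross product instead of a join.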

dhalperi commented 9 years ago

@domoritz must be right.

dhalperi commented 9 years ago

My guess is the compiler is spending too long trying to optimize where all those ors go. I'll look into it.

sophieclayton commented 9 years ago

OK, so what is the best way of structuring that query?


dhalperi commented 9 years ago

It would be like this:

adjusted_particles = select d.fsc_small - m.fsc_avg as fsc_adj,
                            d.chl_small - m.chl_avg as chl_adj,
                            d.pe - m.pe_avg as pe_adj,
                            d.Cruise, d.Day, d.int_file as File_Id
                            from min_maxb as m, good_opp_vctb as d
                            where (d.pop= "0" 
                            or d.pop="1"
                            ...
                            or d.pop="pico4"
                            or d.pop="nano2"
                            or d.pop="pico9"
                            or d.pop="pico8"
                            or d.pop="prochloro")
                            and d.Cruise = m.Cruise
                            and d.Day = m.Day 
                            and d.int_file = m.int_file;

all I did was move the parens

dhalperi commented 9 years ago

However, that will still take too long to parse: it's taking over a minute on my laptop to parse a similar example.

sophieclayton commented 9 years ago

ok, thanks!


stechu commented 9 years ago

That is actually interesting. Can we add this query as a raco test case?

These "or"s on a single table should either be pushed down to Postgres or just be treated as filter conditions when doing the scans.
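
To illustrate the idea of treating the "or"s as a single-table filter, here is a minimal Python sketch over in-memory rows. It is not how raco or Myria implement pushdown; the function name, the WANTED_POPS subset, and the sample rows are all hypothetical, and only the column names come from the query above. The population filter is applied to the particle rows first, and only the survivors are joined on (Cruise, Day, int_file).

```python
from collections import defaultdict

# Hypothetical subset of the population labels listed in the query.
WANTED_POPS = frozenset({"prochloro", "pico", "nano"})


def filter_then_join(particles, stats):
    """Filter particles by population, then equi-join them with stats."""
    # Index the stats rows by the join key so each lookup is cheap.
    stats_by_key = defaultdict(list)
    for m in stats:
        stats_by_key[(m["Cruise"], m["Day"], m["int_file"])].append(m)

    adjusted = []
    for d in particles:
        # Single-table filter: one set lookup replaces the chain of "or"s.
        if d["pop"] not in WANTED_POPS:
            continue
        key = (d["Cruise"], d["Day"], d["int_file"])
        for m in stats_by_key.get(key, []):
            adjusted.append({
                "fsc_adj": d["fsc_small"] - m["fsc_avg"],
                "chl_adj": d["chl_small"] - m["chl_avg"],
                "pe_adj": d["pe"] - m["pe_avg"],
                "Cruise": d["Cruise"],
                "Day": d["Day"],
                "File_Id": d["int_file"],
            })
    return adjusted


# Made-up sample rows, for illustration only.
stats = [{"Cruise": "c1", "Day": 1, "int_file": 7,
          "fsc_avg": 10, "chl_avg": 20, "pe_avg": 30}]
particles = [
    {"pop": "pico", "Cruise": "c1", "Day": 1, "int_file": 7,
     "fsc_small": 15, "chl_small": 26, "pe": 37},
    {"pop": "noise", "Cruise": "c1", "Day": 1, "int_file": 7,
     "fsc_small": 99, "chl_small": 99, "pe": 99},
    {"pop": "nano", "Cruise": "c2", "Day": 1, "int_file": 7,
     "fsc_small": 15, "chl_small": 26, "pe": 37},
]
rows = filter_then_join(particles, stats)
```

Only the first particle row survives: the second fails the population filter, and the third has no matching stats row.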

dhalperi commented 9 years ago

Hi Soph, try again using the restructured query (parentheses moved)?