uwescience / sqlshare

Documentation and help for the SQLShare project
escience.washington.edu/sqlshare
7 stars 2 forks source link

Problem running query: Arithmetic overflow error converting varchar to data type numeric. #10

Closed mgavery closed 11 years ago

mgavery commented 11 years ago

I used the following statement to filter and reformat a table:

SELECT
chr as seqname,
'methratio' as source,
'CpG' as feature, pos as start, pos + 1 as [end], cast(ratio as float) as score,
strand,
'.' as frame,
'.' as attribute FROM [sr320@washington.edu].
[BiGill_methratio_v9_A.txt]

where context like '_CG' and CT_Count >= 5 and ratio <> 'NA'

It worked great and gave me the output I wanted.

...then when I tried to run the same thing, but also add: "and ratio >= 0.500" I received the following error: "Problem running query: Arithmetic overflow error converting varchar to data type numeric."

Is there a way to make this work? Thanks!

dhalperi commented 11 years ago

I'm not sure why the bug happens, and I'm working on it. For now, this is a workaround:

and cast(ratio as float) >= 0.500

mgavery commented 11 years ago

That worked, thanks!

sr320 commented 11 years ago

I am trying to replicate using different file, though still getting the error. Problem running query: Error converting data type varchar to float.

Any ideas?

Query:

SELECT 
chr as seqname,  
'methratio' as source,  
'CpG' as feature, 
pos as start,
pos + 1 as [end],
cast(ratio as float) as score, 
strand,  
'.' as frame,  
'.' as attribute  

FROM [sr320@washington.edu].     
[BiGO_betty_plain_methratio_v1.txt] betty 
where 
context like '__CG_' --_=single character wildcard
and
CT_Count >= 5
and cast(ratio as float) >= 0.500

dhalperi commented 11 years ago

When you paste a big block of SQL code in here, please try and preface it by the string "sql", and end it with the string "". This makes the code readable :).

See the following: (edit this comment to see how I typed it.)

SELECT *
FROM TABLE
dhalperi commented 11 years ago

As with all computer problems, the way to fix this is to break it down into steps. I think the biggest problem with these hard-to-cleanup datasets is that you try and combine it in one step. I suggest you do something like this:

  1. Derive a "clean_XXXX" dataset from "XXX" dataset that has all the columns cast to the right types (and bad rows filtered out). See the dataset here.
  2. Now write queries that derive from the clean version. I copied-and-pasted your query verbatim, but based it off the "clean" dataset, and it worked, here
sr320 commented 11 years ago

Thanks. This totally makes sense.

The reason I am trying to combine everything is that these are parts of workflows we are publishing as part of manuscripts and I am trying to make the steps as simple and elegant as possible.

I will take your advice and start the workflows with a "clean" step.