google-code-export / ords

Automatically exported from code.google.com/p/ords
1 stars 0 forks source link

No warning about data loss when fields are converted to Char #571

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Create a database with fields that have the data type Text or Varchar.
2. Add some data to those fields.
3. Return to the schema designer and change the field type to Char. Don't add a 
size. Save your changes.

What is the expected output? What do you see instead?
A Char field with no size specified defaults to a single character. This means 
that if a Text or Varchar field is changed to a Char field, everything except 
the first character of what is contained in that field will be deleted. There's 
no warning about this - if done by accident, this could result in significant 
data loss.

Please use labels and text to provide additional information.

Original issue reported on code.google.com by meriel.p...@gmail.com on 1 Dec 2014 at 4:46

GoogleCodeExporter commented 9 years ago
I can certainly add a warning about this. It's definitely important to make 
this clear with the char field as it's the opposite behaviour to the varchar 
field (where no length means infinite), it may also apply with other types too.

Under what circumstances should the warning appear, and what should it say?

Original comment by marxjohn...@gmail.com on 3 Dec 2014 at 11:11

GoogleCodeExporter commented 9 years ago
At a minimum, a warning should appear if changing a field type will result in 
someone losing data, because the field size will be truncated as a result of 
the change. (Ideally, ORDS would be able to check and see if this will actually 
happen - i.e. whether the field in question actually does contain values that 
are more than one character long, but if that's not possible/too complicated 
I'd settle for a generic warning that prompts people to check before making the 
change.)

The warning needs to say something like 'Changing the field type will/may 
result in loss of data. Are you sure you want to proceed?' If it's feasible to 
have different warning messages for different situations, we might want to fill 
that out to explain exactly what's going on (and how to avoid it - e.g. by 
adding a size or selecting another field type), though there's a balance to be 
struck between giving enough information and ending up with something that's 
too wordy! 

If the warning text can go in the lang.properties file, I can tweak it later 
once I had a chance to think about it further.

Original comment by meriel.p...@gmail.com on 3 Dec 2014 at 12:07

GoogleCodeExporter commented 9 years ago
Having thought about it very slightly further, I'm now thinking that 'Changing 
the field type may/will result in entries in this field being truncated. Are 
you sure you want to proceed?' might be clearer. (Whether the message should 
say 'will' or 'may' depends on whether ORDS can tell whether data is actually 
being lost, or if it's just a risk.)

Original comment by meriel.p...@gmail.com on 3 Dec 2014 at 12:11

GoogleCodeExporter commented 9 years ago
It is possible to check whether data in the column is longer than the newly 
specified length, but it is a fairly complicated step at add.  For now, I'll 
add a warning regardless so we've got something, and if we want I can make it 
actually check at a later date.

The condition I'll go for for showing the warning is if they change the data 
type, or if they decrease/leave empty the length (with the exception of leaving 
it empty for varchar).

Original comment by marxjohn...@gmail.com on 3 Dec 2014 at 1:44

GoogleCodeExporter commented 9 years ago
If the message will be shown _every_ time the data type is changed, we need to 
rethink the text. Someone changing the data type from (e.g.) Varchar to Text is 
not going to be at risk of losing anything, and it's probably overly alarmist 
if we imply that they are. (This carries two risks: first, that we worry people 
when there's no need, and secondly, that if people see the message a few times 
in situations where data loss isn't a risk, they start ignoring it and then end 
up losing data as a result.

Are there data types other the Char where leaving the size field empty is risky?

Original comment by meriel.p...@gmail.com on 3 Dec 2014 at 1:49

GoogleCodeExporter commented 9 years ago
I dont think leaving it empty carries a risk in any other cases actually. 
However, reducing it does.

So perhaps display the warning whenever it's reduced or left blank, except for 
varchar being left blank?

Original comment by marxjohn...@gmail.com on 3 Dec 2014 at 2:09

GoogleCodeExporter commented 9 years ago
I think I'm missing something here. If leaving it blank doesn't carry a
risk for any data types except Char, why do we need to warn people if they
leave it blank for everything except Varchar? (Warning people when they
reduce the size is definitely a good idea, though.)

Am I right in thinking there are some changes of data type where just the
change can result in data loss, regardless of whether a size is specified?
E.g. changing from Decimal to Integer would presumably remove anything
after a decimal point?

Original comment by meriel.p...@gmail.com on 3 Dec 2014 at 2:13

GoogleCodeExporter commented 9 years ago
Sorry, I'm getting myself confused here.

Leaving char blank appears to be the only circumstance where leaving the length 
blank results in data loss.  With other types reducing the length can result in 
data loss.

You are right that some type changes, such as Decimal to Integer, can also 
result in data loss.  Other type changes where the data is incompatible, such 
as changing text to integer when a record contains letters, will result in an 
error instead.

Original comment by marxjohn...@gmail.com on 3 Dec 2014 at 2:24

GoogleCodeExporter commented 9 years ago
So it sounds as though we need to warn people:

1. If they change the data type to Char and leave the size field blank 
2. If they reduce the value in the size field
3. If they change the data type in a way that may result in data loss

Is it feasible to have a different warning message for each of the above, so we 
can include a bit more explanation about what's going on?

For 3., is it feasible to just display the warning message if the change is one 
that might result in data loss - so, for example, attempting to change from 
Decimal to Integer would trigger the warning, but the reverse wouldn't?

There's also a potential 4. - the case where they change the data type to Char, 
and then put a number in the size field that's small enough to truncate what's 
already there. However, it sounds as though detecting that may be complicated. 
Maybe changing the data type to Char should always generate a warning message? 
(Though ideally a different one depending on whether the size field is blank or 
not.)

I can see that checking whether the content of a field is longer than the size 
limit would be a complicated thing to add. Is it equally complicated to check 
whether there's any data at all in a table? I'm just thinking of the situation 
where someone is designing a database and playing about with options before 
having actually put any data in any of the tables. If losing data is a real 
possibility, you'd definitely want to be warned about that, but it could be 
irritating to be warned about possible data loss when you haven't entered any 
at all data yet! However, if it's not straightforward to distinguish between 
the two, we may just have to live with that as the lesser evil.

Original comment by meriel.p...@gmail.com on 3 Dec 2014 at 2:45

GoogleCodeExporter commented 9 years ago
I might have misunderstood, but I think that 4 is the same as 2 - regardless of 
whether the data type is changed or not, reducing the size may result in 
truncation.

3 is possible and not too complicated. As far as I know, the changes that might 
lose data are other numeric types to integer, and the date/time types to other 
types.

Original comment by marxjohn...@gmail.com on 3 Dec 2014 at 3:05

GoogleCodeExporter commented 9 years ago
The reason I had 4 separated out in my head as a separate case is that changing 
the value of a Char field from blank to anything else _increases_ it rather 
than reducing it. 

I was thinking that someone might increase the field size, but not increase it 
by enough to accommodate whatever data is already there. But thinking about it 
a bit further, this shouldn't actually ever be a problem. Either the field was 
already Char, in which case we don't need a warning, as they won't lose data 
because they're increasing the field size, or they're changing it to Char from 
something else, in which case a change from blank to something else is a 
reduction in field size, so would get the normal 2.-type warning.

For 3., it feels to me as though to be on the safe side, we need to draw a grid 
of field types and consider each from/to option to see if it might pose a 
problem. I might try to do that later today, if I have time.

Original comment by meriel.p...@gmail.com on 3 Dec 2014 at 3:27

GoogleCodeExporter commented 9 years ago
I've knocked up a table here: 
https://docs.google.com/spreadsheets/d/1KgrZgAoaZigBd0ePn-A4tEzdmLpChrTgxarXKr3v
GY0/edit#gid=0

However, as you'll see, there are a fair few options I'm uncertain about 
(particularly what happens when converting from/to binary and Boolean types). 
It's also possible I've got some of the others wrong, so please do run an eye 
over it and check!

I notice that above you say changing from date/time types to other types might 
result in data loss? Why is that? I'd expect either an error (if you tried to 
change from, say, date to integer), or for it just to convert the date to text 
- is that not what happens?

Original comment by meriel.p...@gmail.com on 3 Dec 2014 at 5:56

GoogleCodeExporter commented 9 years ago
Regarding the date/time types:
My initial thought was that changing from Timestamp to Date would lose the time 
data, or to Time would lose the date.

You can also change to other types such as integer and text - this doesn't 
actually lose data but does change the way it's represented. E.g. changing to 
text will give you a string formatted YYYY-MM-DD hh:mm:ss, which changing to 
integer will give you a unix timestamp[1]

[1] https://en.wikipedia.org/wiki/Unix_time

Original comment by marxjohn...@gmail.com on 4 Dec 2014 at 9:27

GoogleCodeExporter commented 9 years ago
I've updated your spreadsheet regarding Booleans.
In summary, you can change from Boolean to Integer, Bigint, char, varchar and 
text, but not to anything else.
You can change from these types back to Boolean, but only if the data only 
contains a boolean-like value (1/0, t/f, true/false)

Regarding BYTEA ("Binary Data", but looking at it again we should really call 
it "Binary String"), the situation is a bit more complicated as more complex 
conversion is required to make any change possible.  To be honest, I'm not so 
convinced that it's a particularly useful data type for us to have available - 
it's designed for storing strings that contain non-printable control 
characters, which is a bit of a niche requirement, and would also be served by 
BLOB, which can store more useful stuff like image files.  I'll bring this up 
at next week's meeting.

Original comment by marxjohn...@gmail.com on 4 Dec 2014 at 11:28

GoogleCodeExporter commented 9 years ago
To summarise/confirm what we've said so far:
* We need confirmation message 1 to appear when converting to a char and 
leaving the size blank
* We need confirmation message 2 to appear when reducing the size of a field
* We need confirmation message 3 to appear when performing and change indicated 
with a "Y" in the spreadsheet.

Original comment by marxjohn...@gmail.com on 4 Dec 2014 at 11:47

GoogleCodeExporter commented 9 years ago
That sounds right to me. I've been trying to work out if there are any other 
potentially risky changes, but haven't identified any yet.

Original comment by meriel.p...@gmail.com on 5 Dec 2014 at 4:26

GoogleCodeExporter commented 9 years ago
Right, I've added the logic so that messages will appear in those 
circumstances.  The messages themselves are in lang.properties awaiting your 
wordsmithery; datatypewarning1, sddatatypewarning2 and sddatatypewarning3.

Original comment by marxjohn...@gmail.com on 8 Dec 2014 at 9:53

GoogleCodeExporter commented 9 years ago

Original comment by marxjohn...@gmail.com on 9 Dec 2014 at 9:19