EnterpriseDB / mongo_fdw

PostgreSQL foreign data wrapper for MongoDB
GNU Lesser General Public License v3.0
326 stars 70 forks source link

floating points #95

Open rotten opened 6 years ago

rotten commented 6 years ago

I have a few documents in mongo with a mix of integers and floating point numbers in them.

If I set the column type in the foreign table to "numeric", the integer values disappear from the results. If I set the column type to "integer", the floating point numbers disappear. If I set the column type to "varchar" all the numbers disappear. How do I cast those integer values to floating point in the FDW without going into mongo and changing them?

(I'm using the latest code from the repo against PostgreSQL 10 and MongoDB 3.4 on Ubuntu 16.04.)

rotten commented 6 years ago

The data is being written into MongoDB using Mongoose which only has a "Number" type. It does not have NumberDecimal. The number type sets all exact values to integer. In our use case we can live with truncating all of the floating point values to integer, so I did that to the entire data set. And then we updated our front-end software to only accept integers. I'm not sure what we would have done if we had to go the other way - coerce everything to a floating point type - because Mongoose would still be putting integers in there when someone enters 4.00, for example. Ideally the FDW would be able to do a type conversion or cast when a mixed number format like this is encountered.

rotten commented 6 years ago

This same problem happens if you have a collection with a column of mixed "numberInt" and "numberLong" types. Setting the column type in the Foreign Table to "integer" will throw out the longs and replace the values with '0'.

This approach converts all of an object's type to the same data type so that you can use a consistent data type in the foreign table. It doesn't fix the problems with whatever tools are inserting data into the collections thoug.

db.mycollection.find({mycolumn: {$exists: true}}).forEach(function(obj) {
    obj.mycolumn = new NumberInt(obj.column);
    db.mycollection.save(obj);
});
andreasscherbaum commented 5 years ago

What exactly do you mean by "values disappear"? Can you post examples?

sirus2003 commented 5 years ago

What exactly do you mean by "values disappear"? Can you post examples?

Hello.

It's very easy to catch this bug. I've created collection with two documents(id + numeric value). First document has value 10.5 as double. Second document has value 10 as int32.

When I try convert value to numeric in PG, my int value will become zero.

I attach a couple of screenshots.

Screenshot 2019-04-02 at 10 36 00 Screenshot 2019-04-02 at 10 37 00
efron commented 4 years ago

I am having the same problem as @sirus2003, int32 is becoming zero.

I am using the metadriver compilation, as that is the only one that accepts SSL.

epaolillo commented 3 years ago

Same here

mariosnic commented 3 years ago

same issue. This should be resolved the same way the mongo BI connector does in resolving type conflicts Sampling Type Conflicts

vaibhavdalvi93 commented 3 years ago

Thanks for reporting an issue. This is a bug and reproducible at my end.

When I try convert value to numeric in PG, my int value will become zero.

The FDW tries to retrieve the current field of type _BSON_TYPEDOUBLE using function BsonIterDouble() which calls mongo-c-driver API _bson_iterdouble(). This API handles only _BSON_TYPEDOUBLE but before that, while checking type compatibility (check if the given BSON type can be converted to the given PostgreSQL type) mongo_fdw allows compatibility between given NUMERIC PostgresSQL type and given _BSON_TYPEINT32 BSON type. Code snippet from function ColumnTypesCompatible():

case NUMERICOID:
if (bsonType == BSON_TYPE_INT32 || bsonType == BSON_TYPE_INT64 ||
bsonType == BSON_TYPE_DOUBLE)
compatibleTypes = true;

The function _bson_iter_double() only considers _BSON_TYPEDOUBLE type and returns zero for other types. Hence, resulting into zero for BSON type int32.

Solution: The alternate API bson_iter_as_double() can be used instead of bson_iter_double(). If field is of type _BSON_TYPEDOUBLE, this returns the double and if field is of type integer such as int32, int64, or bool, it will convert the value to a double.

Find the attached patch which implements above said solution. Please test at your side and let me if it works for you.

Resolve-type-compatibility-between-PostgreSQL-NUMERIC-and-bson-INT32.txt

mariosnic commented 3 years ago

I confirm that the patch mentioned by @vaibhavdalvi93 corrects the following scenario: Mongo collection has a field with both integers and doubles, and foreign table definition defines the column as numeric.

If however the column in the foreign table is defined as an integer, the foreign data wrapper will change all doubles to zeroes, irrespective if they have a fractional component or not.

It also does not cover the scenario where the foreign table has the column defined as text. In this case everything is returned as nulls, unless the field has a value defined as string/text.

The ideal solution in this scenario would be to allow all values, whether double, integer, boolean, date or text to be returned as their textual representation and not nulls (Ideally a foreign column defined as text should always return a value, the textual representation of the contents of the field irrespective of data type).

Type casting should work in a similar way that BI connector resolves type conflicts.

image

vaibhavdalvi93 commented 3 years ago

I confirm that the patch mentioned by @vaibhavdalvi93 corrects the following scenario: Mongo collection has a field with both integers and doubles, and foreign table definition defines the column as numeric.

Thanks, @mariosnic for confirmation on it.

If however the column in the foreign table is defined as an integer, the foreign data wrapper will change all doubles to zeroes, irrespective if they have a fractional component or not.

This can be resolved by using mongo-c API bson_iter_as_int64 instead of using bson_iter_int32 and bson_iter_int64. If foreign table column is NOT int64 then it will try to convert the value to an int64. Such filed type include:

Find a revised patch that resolves the above two issues:

Resolve-type-compatibility-between-PostgreSQL-and-MongoDB-v2.txt

It also does not cover the scenario where the foreign table has the column defined as text. In this case everything is returned as nulls, unless the field has a value defined as string/text.

Thanks for the suggestion. I think this requires a more detailed study. We will try to work on this in near future.

mariosnic commented 3 years ago

@vaibhavdalvi93, Sorry for not replying sooner.

I have also tested the integer type casting modification, which does convert bool, double and int32 to integers.

So I can now confirm that both the double and and int patches are working. Only remaining point to look into is returning any underlying value as its string representation irrespective of data type whenever the foreign table column is defined as string/text.

Thanks for your support on these two patches.

vaibhavdalvi93 commented 2 years ago

Hi @rotten and @efron,

This issue is been resolved. Can you please verify from your end and can you close the ticket if it's been resolved for you.

rotten commented 2 years ago

Thanks for fixing this! Unfortunately I am no longer using MongoDB. I can't easily verify it. I'll trust that you got it!