Closed cuducos closed 7 years ago
@cuducos, answering the questions:
1- If this is supported right now, then the type identification algorithm will not know when it's an integer or float, leading to some problems. So, by now your way is the best one but I'd replace value = float(value)
with value = int(float(value))
since it's expected an int
object from this deserialize
method.
2- Yes! You can force the INPUT_FORMAT
attribute or even create a new class (I prefer the second option):
import rows
# First method
old_format = rows.fields.DateField.INPUT_FORMAT
rows.fields.DateField.INPUT_FORMAT = '%Y-%m-%dT%H:%M:%S'
value = rows.fields.DateField.deserialize('2014-02-16T00:00:00')
rows.fields.DateField.INPUT_FORMAT = old_format
#Second method
class MyDateField(rows.fields.DateField):
INPUT_FORMAT = '%Y-%m-%dT%H:%M:%S'
value = MyDateField.deserialize('2014-02-16T00:00:00')
3- The current way of doing this is using rows.transform
(you can see an example here) but I'm not sure this is the best solution for these cases (still need to find a better API). The use case of renaming/removing fields is very common, so I'm thinking in a more straightforward way of implementing this on the library. This question is also related to this issue: https://github.com/turicas/rows/issues/30 (it may be a better way of having the conversion you want, but it's not "officially" supported by the library).
This weekend I gave this PR another try and the results were quite good when combined with Celery. The process of serializing data is basically done by rows
, which is pretty fast compared to what we had before (30% faster in my local tests, cc @decko).
The PR includes two new libraries when compared to master
, so in order to test it one have to temporary change the docker-compose.yml
to use the Dockerfile
:
tasks:
- image: datasciencebr/jarbas-backend
+ build:
+ context: .
environment:
After this change just docker-compose build
and then proceeds to the regular tests (in this case, importing reimbursements). If this is a merge, just revert the temporary changes to docker-compose.yml
; )
Hi @cuducos @turicas !
Nice that we will have rows in our project, great project, and the collaboration is amazing. I'll test that PR my way, but already 👍 !
Clone the project:
$ git clone git@github.com:datasciencebr/jarbas.git
Change to Jarbas folder:
$ cd jarbas
Checked out to @cuducos PR:
$ git checkout -b cuducos-rows origin/cuducos-rows
Merge with the master content:
$ git merge master
Run the project:
$ docker-compose up -d
Change the docker-compose.yml
file as @cuducos says in his last message
docker-compose build
again:
$ docker-compose build
I think that the best way to test if it is working is by running the tests.
$ docker-compose run --rm django python manage.py check
$ docker-compose run --rm django python manage.py test
Result:
➜ jarbas git:(cuducos-rows) ✗ docker-compose run --rm django python manage.py test
Starting jarbas_queue_1 ...
Starting jarbas_postgres_1 ... done
Starting jarbas_queue_1 ... done
Starting jarbas_tasks_1 ... done
Creating test database for alias 'default'...
System check identified no issues (0 silenced).
.........................................................................................................................................................................
----------------------------------------------------------------------
Ran 169 tests in 5.083s
OK
Destroying test database for alias 'default'...
Ok, got it, works fine, merging it!
This PR is part of the inclusion of
rows
as suggested in #211, starting with the process of importing reimbursements to Postgres.Not sure if it's the best implementation so labeled it as an work in progress and I'd like to invite @turicas for a code review. In general my doubts are:
Is there a cleaver way to force conversion of data such as
'2011.0'
toint
?https://github.com/datasciencebr/jarbas/blob/7d030103f5c67dea214454173f22f16688a28497/jarbas/core/fields.py#L7-L16
Is there a cleaver way to force conversion of data such as
'2014-02-16T00:00:00'
todate
?https://github.com/datasciencebr/jarbas/blob/7d030103f5c67dea214454173f22f16688a28497/jarbas/core/fields.py#L19-L29
Can
rows
rename columns? Actually I'm converting each row (namedtuple
) to aOrderedDict
, then to adict
and them using this method:https://github.com/datasciencebr/jarbas/blob/7d030103f5c67dea214454173f22f16688a28497/jarbas/core/management/commands/reimbursements.py#L53-L63
Thanks @priscilaportela for pairing!