Closed an1018 closed 3 years ago
Thanks for this PR. Here are some general questions/thoughts:
black
and pylint
for checking. preprocess.py
and layoutmodel.py
file
preprocess
function should live within layoutparser
rather than paddledetection
? I think it's paddledetection
's obligation to do the preprocessing for the models. test_model
. Thanks for this PR. Here are some general questions/thoughts:
* Please remove the OCR part from this PR. * Please use PEP8 for as the variable naming conventions -- you could use `black` and `pylint` for checking. * For the `preprocess.py` and `layoutmodel.py` file 1. I think it contains a lot of unnecessarily complicated design -- can we just simply the functions with minal complexity? 2. And more fundamentally, why the `preprocess` function should live within `layoutparser` rather than `paddledetection`? I think it's `paddledetection`'s obligation to do the preprocessing for the models. * Also could you add tests for the paddle modules? You could follow the examples in [`test_model`](https://github.com/Layout-Parser/layout-parser/blob/master/tests/test_model.py).
Thanks for your replay, I'll modify it later!
Thanks for this PR. Here are some general questions/thoughts:
- Please remove the OCR part from this PR.
- Please use PEP8 for as the variable naming conventions -- you could use
black
andpylint
for checking.For the
preprocess.py
andlayoutmodel.py
file
- I think it contains a lot of unnecessarily complicated design -- can we just simply the functions with minal complexity?
- And more fundamentally, why the
preprocess
function should live withinlayoutparser
rather thanpaddledetection
? I think it'spaddledetection
's obligation to do the preprocessing for the models.- Also could you add tests for the paddle modules? You could follow the examples in
test_model
.
Hi, For the preprocess.py
and layoutmodel.py
file,
Hi,I've modied the code, including:
remove the OCR part
use pylint
for checking
For the init.py and catalog.py file,these two files are less modified, so they are not checked
add tests for the paddle modules test_model
Thanks for the updates! The current version is better than the previous one, but I think the code could be further simplified and more structured -- please check the comments for possible updates.
I've also carefully checked the code, and added some changes. Some key updates:
layoutmodel.py
, removing unnecessary variables and definitions. Please note, you mentioned in the previous version that paddle mode may support specifying batch_size
during inference, however that variable was never used.
And in the future we might need to move the paddle model to dropbox/aws as the downloading speed is \~2MB/s in US, taking around 2\~3 minutes to download the whole model, which might be suboptimal.
Otherwise I think this PR is ready to be merged - please take a quick look and let me know if the new changes are OK. Thanks for the great work!
I've also carefully checked the code, and added some changes. Some key updates:
- improve the downloading logic -- it will remove the tar file when downloaded, and check the existence of the untar'ed files.
- further simplify the
layoutmodel.py
, removing unnecessary variables and definitions.Please note, you mentioned in the previous version that paddle mode may support specifying
batch_size
during inference, however that variable was never used.And in the future we might need to move the paddle model to dropbox/aws as the downloading speed is ~2MB/s in US, taking around 2~3 minutes to download the whole model, which might be suboptimal.
Otherwise I think this PR is ready to be merged - please take a quick look and let me know if the new changes are OK. Thanks for the great work!
Hi, thanks for your new changes, which further simplifies the code.
The changes are ok for me,too.
Through this repo and the whole modification process, I also learned a lot. Thank you very much!
I've also carefully checked the code, and added some changes. Some key updates:
- improve the downloading logic -- it will remove the tar file when downloaded, and check the existence of the untar'ed files.
- further simplify the
layoutmodel.py
, removing unnecessary variables and definitions.Please note, you mentioned in the previous version that paddle mode may support specifying
batch_size
during inference, however that variable was never used. And in the future we might need to move the paddle model to dropbox/aws as the downloading speed is ~2MB/s in US, taking around 2~3 minutes to download the whole model, which might be suboptimal. Otherwise I think this PR is ready to be merged - please take a quick look and let me know if the new changes are OK. Thanks for the great work!Hi, thanks for your new changes, which further simplifies the code.
- For the model download source, it is ok to support dropbox/aws, but we hope baidu source is the prior choice since there are also many developers in China.
- Batch size is not used in the code, which is just designed for the expansion, thanks for your check.
- The changes are ok for me, thanks for your careful review again!
I think ultimately we'll provide better downloading support, e.g., mirrored downloading sites and or something mentioned in #43 . At that time, ppl from different regions can choose the best downloading methods accordingly.
Hi, I have reorganized the file structure of Paddle. Please review.Thx