zytedata / zyte-common-items

Contains the common item definitions used in Zyte.
BSD 3-Clause "New" or "Revised" License
9 stars 6 forks source link

Always trim string fields #101

Open Nykakin opened 2 months ago

Nykakin commented 2 months ago

Always trim string fields.

For all fields of type Optional[str] or str apply str.strip() through processors.

For all fields of type Optional[List[str]] or List[str] apply str.strip() to all elements through processors.

Extract processors to separate classes following approach found in extractors to reduce duplication in code.

codecov[bot] commented 1 month ago

Codecov Report

Attention: Patch coverage is 0% with 158 lines in your changes missing coverage. Please review.

Project coverage is 0.00%. Comparing base (cce3b94) to head (2800ff8). Report is 6 commits behind head on main.

Files Patch % Lines
zyte_common_items/pages/product.py 0.00% 25 Missing :warning:
zyte_common_items/pages/job_posting.py 0.00% 21 Missing :warning:
zyte_common_items/pages/real_estate.py 0.00% 19 Missing :warning:
zyte_common_items/pages/business_place.py 0.00% 18 Missing :warning:
zyte_common_items/pages/article.py 0.00% 17 Missing :warning:
zyte_common_items/pages/social_media_post.py 0.00% 10 Missing :warning:
zyte_common_items/processors.py 0.00% 10 Missing :warning:
zyte_common_items/pages/product_list.py 0.00% 9 Missing :warning:
zyte_common_items/pages/product_navigation.py 0.00% 9 Missing :warning:
zyte_common_items/pages/article_list.py 0.00% 8 Missing :warning:
... and 2 more
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #101 +/- ## ====================================== Coverage 0.00% 0.00% ====================================== Files 54 55 +1 Lines 2107 2224 +117 ====================================== - Misses 2107 2224 +117 ``` | [Files](https://app.codecov.io/gh/zytedata/zyte-common-items/pull/101?dropdown=coverage&src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=zytedata) | Coverage Δ | | |---|---|---| | [zyte\_common\_items/pages/base.py](https://app.codecov.io/gh/zytedata/zyte-common-items/pull/101?src=pr&el=tree&filepath=zyte_common_items%2Fpages%2Fbase.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=zytedata#diff-enl0ZV9jb21tb25faXRlbXMvcGFnZXMvYmFzZS5weQ==) | `0.00% <0.00%> (ø)` | | | [zyte\_common\_items/pages/article\_navigation.py](https://app.codecov.io/gh/zytedata/zyte-common-items/pull/101?src=pr&el=tree&filepath=zyte_common_items%2Fpages%2Farticle_navigation.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=zytedata#diff-enl0ZV9jb21tb25faXRlbXMvcGFnZXMvYXJ0aWNsZV9uYXZpZ2F0aW9uLnB5) | `0.00% <0.00%> (ø)` | | | [zyte\_common\_items/pages/article\_list.py](https://app.codecov.io/gh/zytedata/zyte-common-items/pull/101?src=pr&el=tree&filepath=zyte_common_items%2Fpages%2Farticle_list.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=zytedata#diff-enl0ZV9jb21tb25faXRlbXMvcGFnZXMvYXJ0aWNsZV9saXN0LnB5) | `0.00% <0.00%> (ø)` | | | [zyte\_common\_items/pages/product\_list.py](https://app.codecov.io/gh/zytedata/zyte-common-items/pull/101?src=pr&el=tree&filepath=zyte_common_items%2Fpages%2Fproduct_list.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=zytedata#diff-enl0ZV9jb21tb25faXRlbXMvcGFnZXMvcHJvZHVjdF9saXN0LnB5) | `0.00% <0.00%> (ø)` | | | [zyte\_common\_items/pages/product\_navigation.py](https://app.codecov.io/gh/zytedata/zyte-common-items/pull/101?src=pr&el=tree&filepath=zyte_common_items%2Fpages%2Fproduct_navigation.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=zytedata#diff-enl0ZV9jb21tb25faXRlbXMvcGFnZXMvcHJvZHVjdF9uYXZpZ2F0aW9uLnB5) | `0.00% <0.00%> (ø)` | | | [zyte\_common\_items/pages/social\_media\_post.py](https://app.codecov.io/gh/zytedata/zyte-common-items/pull/101?src=pr&el=tree&filepath=zyte_common_items%2Fpages%2Fsocial_media_post.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=zytedata#diff-enl0ZV9jb21tb25faXRlbXMvcGFnZXMvc29jaWFsX21lZGlhX3Bvc3QucHk=) | `0.00% <0.00%> (ø)` | | | [zyte\_common\_items/processors.py](https://app.codecov.io/gh/zytedata/zyte-common-items/pull/101?src=pr&el=tree&filepath=zyte_common_items%2Fprocessors.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=zytedata#diff-enl0ZV9jb21tb25faXRlbXMvcHJvY2Vzc29ycy5weQ==) | `0.00% <0.00%> (ø)` | | | [zyte\_common\_items/pages/article.py](https://app.codecov.io/gh/zytedata/zyte-common-items/pull/101?src=pr&el=tree&filepath=zyte_common_items%2Fpages%2Farticle.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=zytedata#diff-enl0ZV9jb21tb25faXRlbXMvcGFnZXMvYXJ0aWNsZS5weQ==) | `0.00% <0.00%> (ø)` | | | [zyte\_common\_items/pages/business\_place.py](https://app.codecov.io/gh/zytedata/zyte-common-items/pull/101?src=pr&el=tree&filepath=zyte_common_items%2Fpages%2Fbusiness_place.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=zytedata#diff-enl0ZV9jb21tb25faXRlbXMvcGFnZXMvYnVzaW5lc3NfcGxhY2UucHk=) | `0.00% <0.00%> (ø)` | | | [zyte\_common\_items/pages/real\_estate.py](https://app.codecov.io/gh/zytedata/zyte-common-items/pull/101?src=pr&el=tree&filepath=zyte_common_items%2Fpages%2Freal_estate.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=zytedata#diff-enl0ZV9jb21tb25faXRlbXMvcGFnZXMvcmVhbF9lc3RhdGUucHk=) | `0.00% <0.00%> (ø)` | | | ... and [2 more](https://app.codecov.io/gh/zytedata/zyte-common-items/pull/101?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=zytedata) | |
kmike commented 1 month ago

I think that's fine to strip string fields, but we need to change the documentation.