scrapy / scrapy

Scrapy, a fast high-level web crawling & scraping framework for Python.
https://scrapy.org
BSD 3-Clause "New" or "Revised" License
51.16k stars 10.35k forks source link

Adding support for Path objects to APIs that take paths #5739

Closed wRAR closed 3 weeks ago

wRAR commented 1 year ago

As mentioned in #5682, we may have some public APIs that take paths and the implementations should now use Path objects so it should be trivial to also accept Path objects in these APIs.

I think the first step is actually identifying such APIs.

kmike commented 1 year ago

I think we don't have to solve it all at once, PRs with partial implementation are also welcome.

wRAR commented 1 year ago

True.

harshit496 commented 1 year ago

@wRAR @kmike would like to take a first stab at this if it is not already resolved. Thanks!

wRAR commented 1 year ago

@harshit496 it's not resolved, as you can see in the issue status.

harshit496 commented 1 year ago

@wRAR I tried searching the usages of os.path in the whole code base and couldn't find any.

@harshit496 ➜ /workspaces/scrapy (master) $ grep -rnw . -e 'from os'
./tests/test_feedexport.py:17:from os import PathLike
./build/lib/scrapy/utils/template.py:3:from os import PathLike
./build/lib/scrapy/commands/runspider.py:2:from os import PathLike
./build/lib/scrapy/squeues.py:7:from os import PathLike
./scrapy/utils/template.py:3:from os import PathLike
./scrapy/commands/runspider.py:2:from os import PathLike
./scrapy/squeues.py:7:from os import PathLike
@harshit496 ➜ /workspaces/scrapy (master) $ grep -rnw . -e 'from os import path'
@harshit496 ➜ /workspaces/scrapy (master) $ grep -rnw . -e 'os.path.abspath()'
@harshit496 ➜ /workspaces/scrapy (master) $ 
Gallaecio commented 1 year ago

Yes, finding those APIs is probably not that easy.

One way may be to look at the settings, identify which ones are about paths, and see how they are handled.

Mind also cases where import os is used, and then os.path.

wRAR commented 1 year ago

Not sure if searching for os.path usages is helpful here, as we are talking about APIs that take a string and then pass it to pathlib.Path. Actually we shouldn't even have os.path usages after the pathlib conversion.

alexpdev commented 1 year ago

I just submitted PR #5801 which patches one instance of a Path object raising an exception.

ggold7046 commented 10 months ago

Hello, I'm new to opensource. Could anyone guide me in this issue ? Do you have any slack or discord channel ?

wRAR commented 10 months ago

@ggold7046 see https://scrapy.org/community/

DevPatel1023 commented 7 months ago

@wRAR Hey i am beginner so where do i start to contribute and can you please guide me ?

wRAR commented 7 months ago

@DevPatel1023 are you asking about this specific issue or in general?

DevPatel1023 commented 7 months ago

@wRAR No,In genral this is my first open source contribution which I am doing and I have knowledge of basics of python so applying the concepts , growing my skills and helpful for contribution, where do I start in it can you please guide me .

wRAR commented 7 months ago

@DevPatel1023 https://github.com/scrapy/scrapy#contributing

DevPatel1023 commented 7 months ago

@wRAR Thank you

cakemd commented 6 months ago

Hello, @wRAR

I'm new to a project and I'd be happy to make some useful patches.

I was thinking about to start with adding support of Path in scrapy.utils.project.data_path function. Not sure it fits in term "public api", but it's used for reading settings which can be changed publically, if I'm not mistaking.

I see, the string argument is used for making a Path object inside the function. So, if it's a case for "public api", does it make sense to update the function's "path" param declaration there?

Screenshot 2023-11-01 at 17 34 31 Screenshot 2023-11-01 at 17 51 08
wRAR commented 6 months ago

@cakemd yes, I think it's useful and makes sense to change the type hint there.

delaneyscofield commented 3 weeks ago

Does this issue still need to be fixed or has it been resolved?

wRAR commented 3 weeks ago

Good question. I think I'll close this as it proved hard to find more things to fix here.