hardikp / scrapy-finance

[OUTDATED] scrapy spiders to crawl the financial text data :books: :scroll: pertinent to train word vectors :rocket:
MIT License

How to add the date? #1

Open daventhedude opened 6 years ago

daventhedude commented 6 years ago

Hey, first I gotta say that I like your work on GitHub! Could you tell me how to also fetch the date and other information about the articles (especially on Bloomberg)?

Also, I would be delighted if you could tell me how to set the start_date for crawling.

Many thanks in advance!

hardikp commented 6 years ago

Unfortunately, it seems that the Bloomberg spider doesn't work anymore. Not sure if there is a workaround. The other crawlers seem to be working fine.

In general, you can inspect the response for any page using scrapy's shell command:

scrapy shell https://www.investopedia.com/terms/b/block-positioner.asp

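This fetches the given URL and opens an interactive shell (IPython, if it is installed) with the result available as the response object. You can try selectors against it there to find where the page keeps its date/time, then reuse the working selector in the spider's parse method.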
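
Once the shell has shown you where the date sits in the page, the raw string still has to be turned into a real date before it can be stored or compared against a start date. Below is a minimal sketch of that step using only the standard library. The helper names, the list of formats, and the selector mentioned in the comment are assumptions for illustration; they are not part of this repo, and the actual markup differs from site to site.

```python
from datetime import date, datetime
from typing import Optional

# In `scrapy shell`, the raw string might come from something like:
#   response.css('meta[property="article:published_time"]::attr(content)').get()
# That selector is an assumption; check the real page markup in the shell.

# Hypothetical list of formats; extend it after inspecting real pages.
KNOWN_FORMATS = (
    "%Y-%m-%dT%H:%M:%S%z",  # e.g. 2018-03-05T14:30:00+0000
    "%Y-%m-%d",             # e.g. 2018-03-05
    "%B %d, %Y",            # e.g. March 5, 2018
)


def parse_article_date(raw: Optional[str]) -> Optional[datetime]:
    """Try each known format in turn; return None if nothing matches."""
    if not raw:
        return None
    text = raw.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None


def is_on_or_after(raw: Optional[str], start_date: date) -> bool:
    """True if the raw date string parses and is not older than start_date."""
    parsed = parse_article_date(raw)
    return parsed is not None and parsed.date() >= start_date
```

In a spider's parse method, a check like is_on_or_after(raw, start_date) could be used to skip articles older than a chosen start date. This is one possible way to approximate a start_date setting, not something the existing spiders are known to provide.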