nestauk / dap_aria_mapping

Mapping technology innovation to support The Advanced Research and Innovation Agency (ARIA)
MIT License

61 volume pipeline #65

Closed emily-bicks closed 1 year ago

emily-bicks commented 1 year ago

Description

Two main modifications in this PR:

  1. Pipeline to calculate the number of docs per topic/domain/area per year (runs in about 20 minutes): pipelines/app_tables/emergence_table.py
  2. Builds emergence and alignment charts in the Horizon Scanner: analysis/app/pages/1_Horizon_Scanner

To run the app: streamlit run analysis/app/Home.py

Fixes #61
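
For readers who want a feel for what the volume pipeline computes, here is a minimal sketch of counting documents per topic per year. It is not the code in emergence_table.py: it uses pandas rather than polars, the toy inputs and the function name count_docs_per_topic_per_year are hypothetical, and the column names id, topic and publication_date are assumptions based on the diff discussed in this PR.

```python
import pandas as pd

# Hypothetical toy inputs. In the PR these come from get_patent_topics() and
# get_patents(); the shapes used here are an assumption for illustration.
docs_with_topics = {
    "doc_a": ["topic_1", "topic_2"],
    "doc_b": ["topic_1"],
    "doc_c": ["topic_2"],
}
doc_dates = pd.DataFrame(
    {
        "id": ["doc_a", "doc_b", "doc_c"],
        "publication_date": ["2020-03-01", "2020-07-15", "2021-01-10"],
    }
)


def count_docs_per_topic_per_year(docs_with_topics, doc_dates):
    """Return one row per (topic, year) with the number of distinct documents."""
    # Turn the {doc_id: [topics]} dictionary into one row per (doc, topic) pair.
    long_df = (
        pd.Series(docs_with_topics, name="topic")
        .explode()
        .rename_axis("id")
        .reset_index()
    )
    # Extract the publication year for each document.
    dates = doc_dates.assign(
        year=pd.to_datetime(doc_dates["publication_date"]).dt.year
    )
    # Attach years to (doc, topic) pairs, then count distinct docs per group.
    merged = long_df.merge(dates[["id", "year"]], on="id", how="inner")
    return (
        merged.groupby(["topic", "year"])["id"]
        .nunique()
        .reset_index(name="doc_count")
    )


volume = count_docs_per_topic_per_year(docs_with_topics, doc_dates)
print(volume)
```

The same pattern would apply at the domain or area level by swapping the topic labels for the labels at that level of the taxonomy. Counting distinct document IDs rather than rows guards against a document being listed twice under the same topic.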

beingkk commented 1 year ago

Oh, and I managed to run the app on my machine, no issues there.

It does look like some of the time series are somewhat jagged - I guess because of duplicate topic names, so no action is needed on that in this PR as far as I can tell.

beingkk commented 1 year ago

Sounds good!

On Tue, 14 Mar 2023 at 11:32, emily-bicks @.***> wrote:

@.**** commented on this pull request.

In dap_aria_mapping/pipeline/app_tables/emergence_table.py https://github.com/nestauk/dap_aria_mapping/pull/65#discussion_r1135403403 :

+from dap_aria_mapping import BUCKET_NAME, logger
+
+if __name__ == "__main__":
+
+    logger.info("Loading patent data")
+    patents_with_topics = get_patent_topics(tax = "cooccur", level = 3)
+    logger.info("Transforming patent dictionary to polars df")
+    patents_with_topics_df = pl.DataFrame(
+        pd.DataFrame.from_dict(patents_with_topics, orient='index'
+        ).T.unstack().dropna().reset_index(drop=True, level=1).to_frame().reset_index())
+    patents_with_topics_df.columns = ["id", "topic"]
+    logger.info("Loading patent date data")
+    patent_dates = pl.DataFrame(get_patents()).select(
+        ["publication_number", "publication_date"]

Hmm - maybe we should have this conversation on standup?


-- Karlis Kanders https://www.nesta.org.uk/team/karlis-kanders/ (he/him) Senior Data Foresight Lead Nesta - Discovery Hub / Data Analytics Practice
