adsabs / ADSIngestParser

Curation parser library
MIT License

Squashed commit of the following: #117

Closed by seasidesparrow 3 months ago

seasidesparrow commented 3 months ago

commit e3dfe656ac62393ac655de29fde46859dba37e5b
Author: Matthew Templeton <matthew.templeton@cfa.harvard.edu>
Date:   Fri Jul 19 13:28:39 2024 -0400

    modified:   adsingestp/parsers/jats.py
    modified:   tests/stubdata/output/mdpi_climate-11-00147.json
    modified:   tests/stubdata/output/mdpi_symmetry-15-00939.json

commit 4914c5a9835fd0ebb7f851b5ce09f2a47351fce7
Author: Matthew Templeton <matthew.templeton@cfa.harvard.edu>
Date:   Wed Jul 17 11:51:44 2024 -0400

updates pyproject
    modified:   pyproject.toml

commit 44cd801902a84fd41579dc5585bd3b24781da2e7
Author: Matthew Templeton <matthew.templeton@cfa.harvard.edu>
Date:   Wed Jul 17 10:52:47 2024 -0400

fixes affid issue
    modified:   adsingestp/parsers/jats.py

commit 146e4d13de2848bcab4e5ecf0346ce16fd98e94e
Author: Matthew Templeton <matthew.templeton@cfa.harvard.edu>
Date:   Wed Jul 17 09:28:36 2024 -0400

almost got it right, still needs work
    modified:   adsingestp/parsers/base.py
    modified:   adsingestp/parsers/jats.py

commit ebd42ab60cb981916ce7d86034f94763cf537a5d
Author: Matthew Templeton <matthew.templeton@cfa.harvard.edu>
Date:   Wed Jul 17 09:00:57 2024 -0400

    modified:   adsingestp/parsers/base.py

commit c469d621eb3a08a061350b5f745c0dd11b86e62a
Author: Matthew Templeton <matthew.templeton@cfa.harvard.edu>
Date:   Wed Jul 17 08:51:15 2024 -0400

aff ids, still in progress
    modified:   adsingestp/parsers/base.py
    modified:   adsingestp/parsers/jats.py

commit acc40dc6f455366b22967d35318ca58a5f57706b
Author: Matthew Templeton <matthew.templeton@cfa.harvard.edu>
Date:   Wed Jul 10 18:39:44 2024 -0400

updates roman num test with new jats parser
    modified:   adsingestp/parsers/jats.py
    modified:   tests/stubdata/output/jats_springer_roman_num_1.json

commit 69f4a78877aba5714c87afaf602e5c5a6a7b242f
Merge: 529248b 7bf0ca9
Author: Matthew Templeton <matthew.templeton@cfa.harvard.edu>
Date:   Wed Jul 10 18:28:51 2024 -0400

Merge branch 'main' of github.com:adsabs/ADSIngestParser into update_idm_affilid.20240708

commit 529248b038ac537e5ccb730e438d38a4ead67e2d
Author: Matthew Templeton <matthew.templeton@cfa.harvard.edu>
Date:   Wed Jul 10 18:20:24 2024 -0400

mod'ed wiley parser for i_d_m compatibility.
    modified:   adsingestp/parsers/wiley.py
    modified:   tests/stubdata/output/wiley_jgra_57392.json
    modified:   tests/stubdata/output/wiley_swe_21103.json
    modified:   tests/stubdata/output/wiley_swe_461.json
    modified:   tests/stubdata/output/wiley_swe_539.json

commit 237ce74933c91832e0cc0a33bcd35a01ef3320a1
Author: Matthew Templeton <matthew.templeton@cfa.harvard.edu>
Date:   Wed Jul 10 16:06:45 2024 -0400

improved affstring+affid handling, formatting
    modified:   adsingestp/parsers/jats.py
    deleted:    adsingestp/parsers/jtest.py
    deleted:    adsingestp/parsers/jtestaff.py
    deleted:    adsingestp/parsers/piffol.py
    new file:   tests/stubdata/output/els_roman_num_1.json
    new file:   tests/stubdata/output/els_roman_num_2.json
    modified:   tests/stubdata/output/jats_a+a_multiparagraph_abstract.json
    modified:   tests/stubdata/output/jats_a+a_subtitle.json
    modified:   tests/stubdata/output/jats_aip_amjph_90_286.json
    modified:   tests/stubdata/output/jats_aj_158_4_139.json
    modified:   tests/stubdata/output/jats_apj_859_2_101.json
    modified:   tests/stubdata/output/jats_apj_967_1_35.json
    modified:   tests/stubdata/output/jats_aps_phrvd_100_052015.json
    modified:   tests/stubdata/output/jats_aps_phrvx_12_021031.json
    modified:   tests/stubdata/output/jats_edp_aa_661_70.json
    modified:   tests/stubdata/output/jats_edp_jnwpu_40_96.json
    modified:   tests/stubdata/output/jats_iop_aj_162_1.json
    modified:   tests/stubdata/output/jats_iop_ansnn_12_2_025001.json
    modified:   tests/stubdata/output/jats_iop_apj_923_1_47.json
    modified:   tests/stubdata/output/jats_iop_jinst_17_05_P05009.json
    modified:   tests/stubdata/output/jats_iop_no_orcid_tag.json
    modified:   tests/stubdata/output/jats_iop_preprint_in_record.json
    modified:   tests/stubdata/output/jats_iucr_d-60-02355.json
    modified:   tests/stubdata/output/jats_iucr_d-75-00616.json
    modified:   tests/stubdata/output/jats_mnras_493_1_141.json
    modified:   tests/stubdata/output/jats_nature_41467_2023_Article_40261_nlm.json
    new file:   tests/stubdata/output/jats_nature_natas_tmp.json
    modified:   tests/stubdata/output/jats_nature_natsd_12_7375.json
    new file:   tests/stubdata/output/jats_nature_roman_num_1.json
    modified:   tests/stubdata/output/jats_phrvd_106_023001.json
    modified:   tests/stubdata/output/jats_pnas_1715554115.json
    new file:   tests/stubdata/output/jats_sci_376_521.json
    modified:   tests/stubdata/output/jats_spie_jmnmm_1.JMM.21.4.041407.json
    modified:   tests/stubdata/output/jats_spie_opten_1.OE.62.4.048103.json
    modified:   tests/stubdata/output/jats_spie_opten_1.OE.62.4.066101.json
    modified:   tests/stubdata/output/jats_spie_spie_12.2663029.json
    modified:   tests/stubdata/output/jats_spie_spie_12.2663066.json
    modified:   tests/stubdata/output/jats_spie_spie_12.2663263.json
    modified:   tests/stubdata/output/jats_spie_spie_12.2663387.json
    modified:   tests/stubdata/output/jats_spie_spie_12.2663472.json
    modified:   tests/stubdata/output/jats_spie_spie_12.2663687.json
    modified:   tests/stubdata/output/jats_spie_spie_12.2664418.json
    modified:   tests/stubdata/output/jats_spie_spie_12.2664959.json
    modified:   tests/stubdata/output/jats_spie_spie_12.2665099.json
    modified:   tests/stubdata/output/jats_spie_spie_12.2665113.json
    modified:   tests/stubdata/output/jats_spie_spie_12.2665157.json
    modified:   tests/stubdata/output/jats_spie_spie_12.2665696.json
    modified:   tests/stubdata/output/jats_spie_spie_12.2690579.json
    modified:   tests/stubdata/output/jats_springerEarly_ExA_s10686-023-09907-7.json
    modified:   tests/stubdata/output/jats_springer_AcMSn_s10409-023-23061-x.json
    modified:   tests/stubdata/output/jats_springer_AcMSn_s10409-023-23086-x.json
    modified:   tests/stubdata/output/jats_springer_AcMSn_s10409-023-23108-x.json
    modified:   tests/stubdata/output/jats_springer_EPJC_s10052-023-11699-1.json
    modified:   tests/stubdata/output/jats_springer_EPJC_s10052-023-11733-2.json
    modified:   tests/stubdata/output/jats_springer_JHEP_JHEP07_2023_200.json
    modified:   tests/stubdata/output/jats_springer_NatCo_s41467-023-40272-3.json
    modified:   tests/stubdata/output/jats_springer_Natur_s41598-023-38673-x.json
    modified:   tests/stubdata/output/jats_springer_SoPh_s11207-023-02231-5_mathtex.json
    modified:   tests/stubdata/output/jats_springer_ZaMP_s00033-023-02064-z.json
    modified:   tests/stubdata/output/jats_springer_cldy_84_1543.json
    modified:   tests/stubdata/output/jats_springer_jhep_2022_05_05.json
    new file:   tests/stubdata/output/jats_springer_roman_num_1.json
    modified:   tests/stubdata/output/mdpi_climate-11-00147.json
    modified:   tests/stubdata/output/mdpi_galaxies-11-00090.json
    modified:   tests/stubdata/output/mdpi_symmetry-15-00939.json
    modified:   tests/stubdata/output/mdpi_universe-08-00651.json
    modified:   tests/stubdata/output/nlm_tf_gapfd_116_38.json
    new file:   tests/stubdata/output/nlm_tf_roman_num_1.json
    modified:   tests/test_jats.py

commit 8d7d0954062adb46e24bd449227603b2f7a07893
Author: Matthew Templeton <matthew.templeton@cfa.harvard.edu>
Date:   Tue Jul 9 19:00:56 2024 -0400

bugfix -- each affstring was getting all ids
    modified:   adsingestp/parsers/jtest.py

commit f15b5e7972ccc473ae6d7a9d31db8e624acc832f
Author: Matthew Templeton <matthew.templeton@cfa.harvard.edu>
Date:   Tue Jul 9 16:42:13 2024 -0400

Candidate replacement jats parser for external affids
    modified:   adsingestp/parsers/base.py
    new file:   adsingestp/parsers/jtest.py
    new file:   adsingestp/parsers/jtestaff.py
    new file:   adsingestp/parsers/piffol.py
    modified:   pyproject.toml

seasidesparrow commented 3 months ago

Adds feature requested in #104