ietf-tools / relaton-data-ieee

3 stars 5 forks source link

fix: ieee-rawbib fetcher #24

Closed stefanomunarini closed 8 months ago

stefanomunarini commented 9 months ago

@ronaldtse (cc @andrew2net) As I do not have access to this repository secrets, can you please check (or point me to someone I can ask to) if amongst the secrets of this repository exists a IETF_BIB_BOT_PAT token? And, in particular, if this token is valid to clone the private repository https://github.com/ietf-ribose/ieee-rawbib ?

As can be seen in the GHA logs for this PR, the token currently used in the Run bundle exec ruby crawler.rb step seems like to not have enough permissions to clone the above mentioned repository.

stefanomunarini commented 9 months ago

20

stefanomunarini commented 8 months ago

It seems like this secret has been removed from this project.

I added a "test" step to check if secrets.IETF_BIB_BOT_PAT is present among the repository secrets (which replicate previous implementation behaviour, see https://github.com/ietf-tools/relaton-data-ieee/pull/24/files#diff-b2195247a7dfa266381b77e7cbef41ca7cbf143fd06128db55a71de7043507d0R13-R22).

@CAMOBAP we might not need to use username:password (as I was suggesting here), but simply provide this secret and clone the repository using the canonical system("git clone https://oauth2:#{token}@github.com/ietf-ribose/ieee-rawbib.git ieee-rawbib").

Alex, could you point me at someone who has access to this repository settings?

CAMOBAP commented 8 months ago

@stefanomunarini IDK, @ronaldtse maybe you can suggest a right person

stefanomunarini commented 8 months ago

Hi @kesara ,

It seems like this secret has been removed from this project.

I added a "test" step to check if secrets.IETF_BIB_BOT_PAT is present among the repository secrets (which replicate previous implementation behaviour, see https://github.com/ietf-tools/relaton-data-ieee/pull/24/files#diff-b2195247a7dfa266381b77e7cbef41ca7cbf143fd06128db55a71de7043507d0R13-R22).

@CAMOBAP we might not need to use username:password (as I was suggesting here), but simply provide this secret and clone the repository using the canonical system("git clone https://oauth2:#{token}@github.com/ietf-ribose/ieee-rawbib.git ieee-rawbib").

Alex, could you point me at someone who has access to this repository settings?

@kesara as outlined above, this repository is missing a secret, AKA secrets.IETF_BIB_BOT_PAT. Could you please check the settings or point me at someone that has access to this repo settings, to verify that this secret is set? And, if not, help providing it? Thanks

kesara commented 8 months ago

Hi @kesara ,

It seems like this secret has been removed from this project. I added a "test" step to check if secrets.IETF_BIB_BOT_PAT is present among the repository secrets (which replicate previous implementation behaviour, see https://github.com/ietf-tools/relaton-data-ieee/pull/24/files#diff-b2195247a7dfa266381b77e7cbef41ca7cbf143fd06128db55a71de7043507d0R13-R22). @CAMOBAP we might not need to use username:password (as I was suggesting here), but simply provide this secret and clone the repository using the canonical system("git clone https://oauth2:#{token}@github.com/ietf-ribose/ieee-rawbib.git ieee-rawbib"). Alex, could you point me at someone who has access to this repository settings?

@kesara as outlined above, this repository is missing a secret, AKA secrets.IETF_BIB_BOT_PAT. Could you please check the settings or point me at someone that has access to this repo settings, to verify that this secret is set? And, if not, help providing it? Thanks

This is set. It's a personal access token for a bot account that has access to a private repository. Earlier versions of the GHA used this to access the ieee-rawbib repository. I guess the best way to test this is just mock it with a different private repository. See https://github.com/ietf-tools/relaton-data-ieee/blob/5a30bfed1905c73ddcf61136f6413bac6d904d8f/.github/workflows/generate.yml

kesara commented 8 months ago

The secrets are not passed to workflows that are triggered by a pull request from a fork. So the PR tests will continue to fail.

stefanomunarini commented 8 months ago

The secrets are not passed to workflows that are triggered by a pull request from a fork. So the PR tests will continue to fail.

I see, this was the issue then!

Would it be possible to enable passing secrets to workflows that are triggered by a PR? This is only to test the implementation before merging the PR, it could be disabled right after testing is completed.

@kesara

ronaldtse commented 8 months ago

@stefanomunarini I don't think GitHub supports passing secrets to forks due to security concerns, for example, if a public user can create a PR, that means they can send the secret elsewhere eg via a remote call.

stefanomunarini commented 8 months ago

@stefanomunarini I don't think GitHub supports passing secrets to forks due to security concerns, for example, if a public user can create a PR, that means they can send the secret elsewhere eg via a remote call.

Right, makes total sense. I wonder if the only way to test this implementation is merging it (in this case, can you @CAMOBAP review this PR, please?)? Is there no other way otherwise?

ronaldtse commented 8 months ago

I don't think there's a common alternative. @CAMOBAP can you please help check, thanks!

kesara commented 8 months ago

When this PR is ready, I'm happy to merge and I can revert the change if it doesn't work.

stefanomunarini commented 8 months ago

Hi @kesara , we are waiting for final review from @CAMOBAP . After he's done with it, this PR can be merged. Thanks

CAMOBAP commented 8 months ago

Hi @kesara , we are waiting for final review from @CAMOBAP . After he's done with it, this PR can be merged. Thanks

@stefanomunarini if PR is ready for review please press "Ready for review" because usually "draft" PRs mean that the work is in progress and it doesn't ready for review

BTW there is no objections from my side, once workflow actually works

stefanomunarini commented 8 months ago

Hi @kesara , we are waiting for final review from @CAMOBAP . After he's done with it, this PR can be merged. Thanks

@stefanomunarini if PR is ready for review please press "Ready for review" because usually "draft" PRs mean that the work is in progress and it doesn't ready for review

You're right, that was an oversight from my side. Sorry about that

@kesara feel free to merge the PR, thanks

CAMOBAP commented 8 months ago

@stefanomunarini should we rerun the CI because last message was "fatal: Authentication failed for 'https://github.com/ietf-ribose/ieee-rawbib.git/'"

stefanomunarini commented 8 months ago

@stefanomunarini should we rerun the CI because last message was "fatal: Authentication failed for 'https://github.com/ietf-ribose/ieee-rawbib.git/'"

@CAMOBAP This issue has been discussed above, in particular Ronal mentioned that:

@stefanomunarini I don't think GitHub supports passing secrets to forks due to security concerns, for example, if a public user can create a PR, that means they can send the secret elsewhere eg via a remote call.

thus the authentication failed message.

Hence, we need to merge this PR in order to be able to test it. Kesara offered to roll it back immediately in the event of a CI failure.

kesara commented 8 months ago

GHA is running here: https://github.com/ietf-tools/relaton-data-ieee/actions/runs/6754213519/job/18361482188 There are errors but GHA is not complaining yet.

kesara commented 8 months ago

GHA is running here: https://github.com/ietf-tools/relaton-data-ieee/actions/runs/6754213519/job/18361482188 There are errors but GHA is not complaining yet.

And success. Can someone confirm the errors in Run bundle exec ruby crawler.rb *** are expected?

stefanomunarini commented 8 months ago

@CAMOBAP @andrew2net can you have a look at the comment above? Is this expected or is there anything that needs to be fixed?

CAMOBAP commented 8 months ago

@stefanomunarini crawler.rb looks right to me (probaby we can remove duration calculation logic out of it because fetch should already do this)

@andrew2net to me it looks like we have some issue in relaton-ieee (it looks like `country' is missing) could you please have a look?

File: ieee-rawbib/updates.20230831/IEEEUpdates_IEEEstd/week27b/3185.zip
undefined method `text' for nil:NilClass
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/relaton-ieee-1.14.8/lib/relaton_ieee/data_parser.rb:160:in `block (2 levels) in parse_contributor'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/nokogiri-1.14.5-x86_64-linux/lib/nokogiri/xml/node_set.rb:235:in `block in each'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/nokogiri-1.14.5-x86_64-linux/lib/nokogiri/xml/node_set.rb:234:in `upto'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/nokogiri-1.14.5-x86_64-linux/lib/nokogiri/xml/node_set.rb:234:in `each'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/relaton-ieee-1.14.8/lib/relaton_ieee/data_parser.rb:155:in `each_with_object'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/relaton-ieee-1.14.8/lib/relaton_ieee/data_parser.rb:155:in `block in parse_contributor'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/nokogiri-1.14.5-x86_64-linux/lib/nokogiri/xml/node_set.rb:235:in `block in each'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/nokogiri-1.14.5-x86_64-linux/lib/nokogiri/xml/node_set.rb:234:in `upto'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/nokogiri-1.14.5-x86_64-linux/lib/nokogiri/xml/node_set.rb:234:in `each'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/relaton-ieee-1.14.8/lib/relaton_ieee/data_parser.rb:153:in `map'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/relaton-ieee-1.14.8/lib/relaton_ieee/data_parser.rb:153:in `parse_contributor'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/relaton-ieee-1.14.8/lib/relaton_ieee/data_parser.rb:43:in `block in parse'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/relaton-ieee-1.14.8/lib/relaton_ieee/data_parser.rb:43:in `each'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/relaton-ieee-1.14.8/lib/relaton_ieee/data_parser.rb:43:in `parse'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/relaton-ieee-1.14.8/lib/relaton_ieee/data_fetcher.rb:102:in `fetch_doc'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/relaton-ieee-1.14.8/lib/relaton_ieee/data_fetcher.rb:62:in `block in fetch'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/relaton-ieee-1.14.8/lib/relaton_ieee/data_fetcher.rb:57:in `each'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/relaton-ieee-1.14.8/lib/relaton_ieee/data_fetcher.rb:57:in `fetch'
/opt/hostedtoolcache/Ruby/2.7.8/x64/lib/ruby/gems/2.7.0/gems/relaton-ieee-1.14.8/lib/relaton_ieee/data_fetcher.rb:47:in `fetch'
crawler.rb:13:in `<main>'
andrew2net commented 8 months ago

File: ieee-rawbib/updates.20230831/IEEEUpdates_IEEEstd/week27b/3185.zip

@CAMOBAP The last files from 2023 caused the error because address country not found in them. But I'm unable to get and explore these files from the ietf-tools/ieee-rawbib repo. Is there any way to get one of the files using GHA?

CAMOBAP commented 8 months ago

@andrew2net I think the right way will be to ask @stefanomunarini @kesara to provide required files directly (because they are in the private repo)

@stefanomunarini @kesara is it possible?

kesara commented 8 months ago

@andrew2net I think the right way will be to ask @stefanomunarini @kesara to provide required files directly (because they are in the private repo)

@stefanomunarini @kesara is it possible?

I just emailed the requested file.

ronaldtse commented 8 months ago

@kesara could you please help provide us access to the ieee-rawbib repository so that we can address the newly introduced parsing errors from the new files? Thanks!

andrew2net commented 8 months ago

File: ieee-rawbib/updates.20230831/IEEEUpdates_IEEEstd/week27b/3185.zip

The file contains an address without country:

    ...
    <address>
        <city>New York</city>
    </address>
    ...

@kesara @ronaldtse For the NY city we can add the USA country. All the other addresses also have US cities. It seems we can use USA as a default country, right?