ietf-tools / relaton-data-ieee


Why does this repo clone itself? #20

Closed andrew2net closed 11 months ago

andrew2net commented 1 year ago

@stefanomunarini Before the commit this repo got documents from the private repo https://github.com/ietf-ribose/ieee-rawbib. This GHA workflow step checked out the source repo:

    ...
      - name: Checkout source
        uses: actions/checkout@v2
        with:
          repository: ietf-ribose/ieee-rawbib
          token: ${{ secrets.IETF_BIB_BOT_PAT }}
          path: ieee-rawbib
    ...

then the cloned data was converted to Relaton format

    ...
      - name: Fetch documents
        run: |
          rm -rf data
          relaton fetch-data ieee-rawbib
    ...

The commit adds a crawler, and the crawler clones this repo into itself:

    ...
    system("git clone https://github.com/ietf-tools/relaton-data-ieee")
    ...

I think this is a mistake.

stefanomunarini commented 12 months ago

Hi @andrew2net, thanks for reporting this. Are you suggesting that the crawler should be cloning https://github.com/ietf-ribose/ieee-rawbib instead?

E.g.

    [...]
    puts "Started at: #{t1}"

    system("git clone https://github.com/ietf-ribose/ieee-rawbib")
    FileUtils.rm_rf("data")
    FileUtils.cp_r("relaton-data-ieee/data", ".")
    [...]

In this case we would probably have to ask ietf-tools to fork https://github.com/ietf-ribose/ieee-rawbib so that cloning can be done from https://github.com/ietf-tools/ieee-rawbib

andrew2net commented 12 months ago

Hi @stefanomunarini

No, the crawler should not just clone ietf-ribose/ieee-rawbib. It should temporarily clone it into ieee-rawbib and run the converter. In GHA we used the shell command relaton fetch-data ieee-rawbib to run the converter, but in the crawler the Ruby method RelatonIeee::DataFetcher.fetch is a better way. We shouldn't add the ieee-rawbib folder to the commit, as it is just temporary source data.

We don't need to ask ietf-tools to fork ieee-rawbib. As you can see, in GHA we used a token to access the repo: token: ${{ secrets.IETF_BIB_BOT_PAT }}. You can use the token in the git clone command. Also, our goal is to get updates from ieee-rawbib, but a fork won't be updated. Finally, IEEE wants ieee-rawbib to stay private, so forks are not possible.
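The "clone temporarily, convert, then discard" flow described in this comment could be sketched roughly as follows. This is only an illustration, not the crawler's actual code: `with_temporary_source` and the `cloner` argument are hypothetical names, and the call to `RelatonIeee::DataFetcher.fetch` is shown only in a comment because its arguments are not given in this thread.

```ruby
require "fileutils"

# Hypothetical helper: clone the raw source into `dir`, hand it to the block,
# and always delete it afterwards so it can never end up in a commit.
# `cloner` is any callable that takes the target directory and returns a
# truthy value on success (for example a lambda wrapping `git clone`).
def with_temporary_source(dir, cloner)
  FileUtils.rm_rf(dir) # start from a clean directory
  raise "cloning source data into #{dir} failed" unless cloner.call(dir)

  yield dir
ensure
  FileUtils.rm_rf(dir) # runs on success and on failure
end

# Possible use in the crawler (assumed, left commented out on purpose):
#
#   git_clone = ->(dir) { system("git", "clone", "--depth", "1", clone_url, dir) }
#   with_temporary_source("ieee-rawbib", git_clone) do
#     FileUtils.rm_rf("data")
#     RelatonIeee::DataFetcher.fetch # method named above; arguments not shown in this thread
#   end
```

Passing the clone step in as a callable keeps the cleanup logic independent of how the private repository is accessed, which is still an open question here.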

stefanomunarini commented 12 months ago

Thanks @andrew2net. Amongst the secrets in https://github.com/ietf-ribose/relaton-data-ieee/settings/secrets/actions I cannot see secrets.IETF_BIB_BOT_PAT. Which of the tokens listed on that page can be used to clone ietf-ribose/ieee-rawbib? Also, as I do not have access to ietf-ribose/ieee-rawbib, if you do, can you tell me whether that repository has a data/ folder at root level that we can copy data from? Or should we just clone it, run the fetcher command, and let it generate data in the relaton-data-ieee repository?

andrew2net commented 12 months ago

@stefanomunarini I don't have access to this repository's tokens. Maybe the token is stored at the organization level. Ping @ronaldtse. I also don't have access to ietf-ribose/ieee-rawbib, but from the previous GHA workflow we can see that we just need the root level of the clone. There is no data/ folder or any other folder that we need to pick up. The previous GHA step that cloned ietf-ribose/ieee-rawbib was:

    ...
      - name: Checkout source
        uses: actions/checkout@v2
        with:
          repository: ietf-ribose/ieee-rawbib
          token: ${{ secrets.IETF_BIB_BOT_PAT }}
          path: ieee-rawbib
    ...

It's equivalent to the git command:

    git clone https://<token>@github.com/ietf-ribose/ieee-rawbib.git ieee-rawbib

stefanomunarini commented 11 months ago

It seems like we may need to use a username and token here instead, e.g. git clone https://username:token@github.com/user/repo. Because I do not have access to this repo's secrets, can anybody add the username secret, or provide its placeholder if it's already present? @kesara do you have permissions to perform this operation?
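The two URL shapes mentioned in this thread (token only, and username plus token) can both be built from one small helper. This is a minimal sketch under assumptions: `authenticated_clone_url` is a hypothetical name, and exposing the secret to the crawler as an environment variable called `IETF_BIB_BOT_PAT` is assumed, not confirmed anywhere above. Which form GitHub accepts for this particular token is the open question in the thread.

```ruby
require "uri"

# Hypothetical helper: embed credentials in an HTTPS clone URL for GitHub.
# With only a token it yields https://<token>@github.com/<repo>.git;
# with a username it yields https://<username>:<token>@github.com/<repo>.git.
def authenticated_clone_url(repo, token, username: nil)
  userinfo = [username, token].compact
                              .map { |part| URI.encode_www_form_component(part) }
                              .join(":")
  "https://#{userinfo}@github.com/#{repo}.git"
end

# Possible use in the crawler (assumed, left commented out on purpose):
#
#   token = ENV.fetch("IETF_BIB_BOT_PAT") # assumed variable name
#   url = authenticated_clone_url("ietf-ribose/ieee-rawbib", token)
#   system("git", "clone", url, "ieee-rawbib")
```

Calling `system` with separate arguments rather than one interpolated string keeps the token out of shell parsing, though it can still show up in process listings and in the clone's stored remote URL.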