ersilia-os / ersilia

The Ersilia Model Hub, a repository of AI/ML models for infectious and neglected disease research.
https://ersilia.io
GNU General Public License v3.0

[🐈 Task]: Putting Log Files from Workflow Errors into Artifacts #559

Closed miquelduranfrigola closed 2 months ago

miquelduranfrigola commented 1 year ago

If a GitHub Actions workflow fails, save the log files as artifacts so a contributor can export them. Explore ways to add a link to the log file artifact so a contributor or maintainer can resolve the issue.
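A minimal sketch of what this could look like as workflow steps, not a definitive implementation. It assumes the logs end up as `.log` files under `~/eos`; the step names, the `upload-logs` id, and the artifact name are hypothetical. The `artifact-url` output is only available in recent major versions of `actions/upload-artifact` (v4 here).

```yaml
# Hypothetical steps appended to an existing job; names and the
# ~/eos log location are assumptions made for illustration.
- name: Upload logs on failure
  id: upload-logs
  if: failure()                      # run only when an earlier step failed
  uses: actions/upload-artifact@v4
  with:
    name: workflow-logs-${{ github.run_id }}
    path: ~/eos/*.log
    if-no-files-found: warn          # do not fail again if no log exists

- name: Link the log artifact in the job summary
  if: failure() && steps.upload-logs.outputs.artifact-url != ''
  run: |
    echo "Log files: ${{ steps.upload-logs.outputs.artifact-url }}" >> "$GITHUB_STEP_SUMMARY"
```

Writing the URL to the job summary is one way to surface the link; the artifact is also listed on the workflow run page.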

miquelduranfrigola commented 1 year ago

Hi,

There are several log files that are created at fetch time. At the moment, they are stored in temporary folders, which makes them difficult to trace. I will collect these log files in a pre-defined folder, which should make it easier to gather the artifacts.

At serve time, I think there is only one log file, so it will be easy.

I have assigned this issue to myself. If I face problems with including artifacts, I will ask @megamanics or @honeyankit . Thanks!

megamanics commented 1 year ago

@miquelduranfrigola multiple paths with wildcards can be passed to collate the artifacts

      - name: Archive multiple artifacts
        uses: actions/upload-artifact@v2
        with:
          name: fetch-logs
          path: |
            dist
            !dist/**/*.md

miquelduranfrigola commented 1 year ago

Thanks @megamanics this is helpful!

GemmaTuron commented 1 year ago

This is related to https://github.com/ersilia-os/ersilia/issues/537. We will need to collect the log files in the same directory first.

GemmaTuron commented 1 year ago

This issue is not a priority; we will leave it open and tackle it when possible.

GemmaTuron commented 11 months ago

@miquelduranfrigola maybe this is a good time to pick this up again? Could we assign it to one of the new interns?

miquelduranfrigola commented 10 months ago

Good idea. Are you still interested, @GemmaTuron ?

GemmaTuron commented 10 months ago

Yes, who do you want to assign it to @miquelduranfrigola ?

miquelduranfrigola commented 10 months ago

Let's ask in the internships channel? Whoever has some experience with GitHub Actions.

DhanshreeA commented 5 months ago

@miquelduranfrigola this isn't currently happening, right?

GemmaTuron commented 5 months ago

No, this has not been tackled yet, so it might be a good moment to do so!

DhanshreeA commented 4 months ago

@dzumii would you like to take this up?

dzumii commented 4 months ago

OK! I will check it out and let you know if I have questions or run into any blockers.

miquelduranfrigola commented 4 months ago

Thanks all. If any feedback is needed, please let me know.

dzumii commented 4 months ago

@GemmaTuron @DhanshreeA @miquelduranfrigola To get the log files, I have been focusing on the eos/tmp directory and found that nothing is really written there in terms of run logs. @DhanshreeA just let me know on a call that these files are created in the /tmp directory under the root directory, where a couple of folders are created. I am still a bit confused about which of the files we actually want to upload as artifacts.

miquelduranfrigola commented 4 months ago

OK thanks @dzumii. I am a little bit confused - I am sure I am missing something, so apologies beforehand. In my opinion, the logs that we should be keeping are in the files eos/current.log and/or eos/console.log. Also, when models are being tracked, we could consider improving/enlarging the eos/ersilia_runs/logs/... files, but I would deprioritize this for now. In any case, I think that keeping info from the tmp directory may be difficult, since a very large number of files are generated there. @DhanshreeA - am I missing something? Is there a reason why we want to keep more than what is found in the .log files in the eos folder? If so, should we maybe dynamically channel the missing information into those logs?

DhanshreeA commented 4 months ago

@miquelduranfrigola I've generally found the cycle for debugging model issues a bit long, because the bentoml server logs and the subprocess logs from run.sh, which get generated in the /tmp directory of a host, are not fully captured in either current.log or console.log. I think we're both talking about the same thing from slightly different angles: it would indeed be helpful to increase the scope of the logs we collect in something like eos/ersilia_runs/logs/..., especially to make model troubleshooting easier for maintainers and contributors. We can indeed deprioritise it for now, in which case this issue can be closed, since both console.log and current.log will be uploaded as artifacts for each model with a 14-day retention period (subject to model workflows being updated).
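For reference, uploading the two log files with a 14-day retention period could be sketched as the step below. This is an illustration under stated assumptions, not the step that was actually added to the model workflows: the `~/eos` location, the step name, the artifact name, and the choice of `if: always()` are all hypothetical.

```yaml
# Hypothetical step; assumes console.log and current.log live in ~/eos.
- name: Upload Ersilia log files
  if: always()                       # assumption: keep logs for passing runs too
  uses: actions/upload-artifact@v4
  with:
    name: ersilia-logs-${{ github.run_id }}
    path: |
      ~/eos/console.log
      ~/eos/current.log
    retention-days: 14               # artifact is deleted after 14 days
    if-no-files-found: warn
```

Replacing `always()` with `failure()` would restrict the upload to failed runs, which matches the original request more narrowly.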

miquelduranfrigola commented 4 months ago

OK, I see, and I agree. @DhanshreeA perhaps a good moment to think about this is actually now, when we are trying to run multiple sessions at the same time. As part of this procedure, we can redirect logs into appropriate folders as well. Let me know what you think.

DhanshreeA commented 4 months ago

I absolutely agree @miquelduranfrigola - this would be super useful to have now.

miquelduranfrigola commented 4 months ago

OK, then let's go for it this week. @DhanshreeA and @miquelduranfrigola can give it a first go, and then we can share progress with @dzumii on Wednesday or Thursday. Sounds good?

GemmaTuron commented 2 months ago

Hello @DhanshreeA

What is the status of this? Did we merge @dzumii's work?

DhanshreeA commented 2 months ago

I think this can be safely closed.