chanzuckerberg / shasta

[MOVED] Moved to paoloshasta/shasta. De novo assembly from Oxford Nanopore reads

Please contemplate supporting compatibility with old kernels #236

Closed gitcruz closed 3 years ago

gitcruz commented 3 years ago

Dear Paolo,

I understand your concerns about supporting old kernels. But scientific clusters like ours tend to keep old, stable kernels in order to accommodate robust production pipelines, and this is very hard to change. On the other hand, our center is producing ONT data at large scale (a GridION and a PromethION), and we assemble the data on our own cluster without having to use AWS.

Our IT support team has tried to containerize version 0.7.0, but the container also relies on our old kernel and conflicts with it.

Could you contemplate releasing shasta-OldLinux-release files for a while?

I would really appreciate it if you did.

Thanks, Fernando

paoloczi commented 3 years ago

In release 0.7.0 we decided to remove support for kernels older than 3.2.0 because maintenance updates for CentOS 6.0 were ending on 11/30/2020 (see https://en.wikipedia.org/wiki/CentOS#End-of-support_schedule for more information). As far as I know, all currently supported Linux systems use kernel 3.2.0 or newer, and therefore the released Shasta executable runs on them. Can I ask what Linux version your cluster is using? Using a very old system exposes your cluster to hacking due to the lack of security updates, unless you keep it from accessing the Internet entirely.

If we want to return to supporting older kernels, there are two ways of doing it:

  1. Build Shasta on an older Linux system such as Ubuntu 16.04 (we currently build on Ubuntu 20.04) which has an older compiler version that generates executables compatible with older kernels. This is not practical because it ties us to compatibility with an old compiler. This would mean, for example, that we could not benefit from compiler improvements that happened in the last few years, including bug fixes, support for new C++ features, and performance improvements. It would also force us to only use other packages that can also be built on the same old compiler version.

  2. It is possible to build a specially configured version of the current version of the compiler that will generate executables compatible with older kernels. However this requires some serious system work, I have never done it, and given that the Shasta team is currently composed of just myself it would be hard for me to invest in this. Would you or somebody in your group be willing to do this work? I could provide guidance. I am also adding a "help wanted" label to this issue to see if somebody volunteers.

I would rule out option 1 for the reasons I mentioned above. Option 2 is more practical, but we have to find a way to make it happen, and of course it is always possible that it ends up not being feasible for one reason or another.

gitcruz commented 3 years ago

Thanks for your quick and positive response, Paolo.

I understand that the extra maintenance load is not a priority. However, I would like to benefit from the improvements in new releases. In fact, 0.6.0 completed an assembly in 2 hours, while v0.1.0 has been running for 1 day and 5 hours and still hasn't computed much of the output (it is still at tmp-LowHash-Buckets.count).

I might point my colleague in the IT department to this conversation. Do you think option 2 is worth doing now, or for your next release, v0.8.0?

If you want, I can tell you the kernel version in a separate email. You really scared me with the security issues! :)

Best, Fernando


paoloczi commented 3 years ago

Yes, there are many benefits from using newer Shasta versions, not just in performance, but also in accuracy and functionality. And it would be unfortunate if you had to be stuck with release 0.6.0.

For best performance in a production environment, make sure to use options --memoryMode filesystem --memoryBacking 2M, which however require root access. If you do this, you need to use Shasta command cleanupBinaryData after running the assembly when you no longer need the Shasta binary data. See here for more information.
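As a rough sketch of what that looks like on the command line: the options are the ones named above, while the input file name, the assembly directory, and the `run` dry-run helper are placeholders chosen for illustration. Depending on the Shasta release, further options (for example a configuration) may be required, so treat this as a starting point and not as the documented invocation.

```shell
#!/usr/bin/env bash
# Sketch only. READS and ASSEMBLY_DIR are hypothetical placeholders.
READS=reads.fasta
ASSEMBLY_DIR=ShastaRun

# Print the command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Assembly with binary data on a 2M-page filesystem (needs root).
shasta_fast_assembly() {
    run sudo shasta --input "$READS" --assemblyDirectory "$ASSEMBLY_DIR" \
        --memoryMode filesystem --memoryBacking 2M
}

# Release the binary data once it is no longer needed.
shasta_cleanup() {
    run sudo shasta --command cleanupBinaryData --assemblyDirectory "$ASSEMBLY_DIR"
}

shasta_fast_assembly
shasta_cleanup
```

With the default `DRY_RUN=1` the script only prints the two commands, which makes it easy to review them before running anything as root.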

If anybody is willing to work on option 2, that can be done at any time. We don't have to wait for a release for that.

Yes, please give me more information about the Linux system you use in your cluster - both the Linux distribution and the kernel version. Some of my colleagues are experts in containers, and I can ask for their opinion on why you were not able to run 0.7.0 on an old kernel, even in a container, and for possible solutions using containers.

Security is definitely an issue if you have an old system that is no longer receiving maintenance updates (such as CentOS 6) and has access to the Internet. If you have a cluster, your IT department could gradually convert it to a newer Linux system a few machines at a time - it does not have to be an "all or nothing".

gitcruz commented 3 years ago

Hi Paolo,

This is the information on our Linux system and kernel:

lsb_release -a
LSB Version: :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch
Distributor ID: RedHatEnterpriseServer
Description: Red Hat Enterprise Linux Server release 6.7 (Santiago)
Release: 6.7
Codename: Santiago

uname -r
2.6.32-696.13.2.el6.Bull.128.x86_64
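For anyone wanting to check their own machine against the 3.2.0 minimum mentioned earlier in the thread, here is a minimal sketch. The function name `kernel_at_least` is made up for illustration, and it assumes bash plus a `sort` that supports `-V` (as in GNU coreutils).

```shell
#!/usr/bin/env bash
# kernel_at_least RELEASE MINIMUM
# Succeeds if the numeric prefix of RELEASE (e.g. the output of `uname -r`)
# is greater than or equal to MINIMUM (e.g. 3.2.0).
kernel_at_least() {
    local have min
    # Keep only the leading dotted number, e.g. "2.6.32" from "2.6.32-696.13.2.el6...".
    have=$(printf '%s\n' "$1" | sed -E 's/^([0-9]+(\.[0-9]+)*).*/\1/')
    min=$2
    # In version order, the minimum must sort first (or be equal).
    [ "$(printf '%s\n%s\n' "$min" "$have" | sort -V | head -n 1)" = "$min" ]
}

if kernel_at_least "$(uname -r)" 3.2.0; then
    echo "kernel is 3.2.0 or newer"
else
    echo "kernel is older than 3.2.0"
fi
```

For the kernel reported above, the numeric prefix is 2.6.32, so the check fails, which is consistent with the released 0.7.0 executable not running on this cluster.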

paoloczi commented 3 years ago

Thank you for that information! Let me ask around a bit.

You could also try the Docker images documented here, but it is possible that you will bump into the same issues.

Since you are on RedHat 6 (and not CentOS 6), RedHat will continue to provide "Extended life cycle support" until June 2024. However, according to RedHat documentation, "Extended life cycle support" does not include security fixes. So security is an issue, if your system has Internet access.

gitcruz commented 3 years ago

Hi Paolo,

A colleague in the IT department finally managed to install Shasta v0.7.0 inside a singularity container. The container was created with ubuntu-16 (still supported by our kernel). I tested it on two different vertebrate genomes and it works well.

Thanks, Fernando

paoloczi commented 3 years ago

Thank you for letting us know. If you or your colleague can give a detailed description of what you did and post it here, it could be helpful to other people in a similar situation.

misago162 commented 3 years ago

Hi. I'm the technician who installed this software inside a singularity container.

It is an easy task if you are familiar with singularity containers. I had to use Ubuntu 16 in the container because our cluster has an old operating system and cannot run any other release of Ubuntu:

1.- Create a sandbox singularity container. You have to be the root user or have sudo privileges to execute the singularity command.

2.- As root, enter the container in write mode and install the prerequisites needed to build shasta-0.7.0, following the normal instructions for building shasta:

https://chanzuckerberg.github.io/shasta/BuildingFromSource.html

Note: as I had to use ubuntu-16, the script

shasta/scripts/InstallPrerequisites-Ubuntu.sh

fails, because it only works with ubuntu-20. What I had to do was open this script, check the list of packages it installs using apt-get, and install that list of packages inside the container using the apt command. If you are using ubuntu-20 in your singularity container, I guess you only have to execute the InstallPrerequisites-Ubuntu.sh script.

3.- Again, following the instructions on the page

https://chanzuckerberg.github.io/shasta/BuildingFromSource.html

just clone and build shasta as described there.

4.- Convert the sandbox container to a sif container, and you can start using shasta through this new singularity container.
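The four steps above might look roughly like this on the command line. This is a sketch under assumptions, not the exact commands that were used: the sandbox and image names are placeholders, the base image is assumed to come from Docker Hub, the `run` helper is a hypothetical dry-run wrapper, and the build commands shown in the comments should be checked against the BuildingFromSource page for the release you want.

```shell
#!/usr/bin/env bash
# Sketch only. All names below are hypothetical placeholders.
SANDBOX=shasta-sandbox
IMAGE=shasta.sif
BASE=docker://ubuntu:16.04   # assumption: Ubuntu 16.04 pulled from Docker Hub

# Print the command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

build_container() {
    # Step 1: create a writable sandbox container (needs root or sudo).
    run sudo singularity build --sandbox "$SANDBOX" "$BASE"

    # Steps 2 and 3: open a writable shell in the sandbox. Inside it,
    # install the prerequisites by hand and build Shasta, for example:
    #   apt-get update
    #   apt-get install -y <packages listed in InstallPrerequisites-Ubuntu.sh>
    #   git clone https://github.com/chanzuckerberg/shasta.git
    #   mkdir shasta-build && cd shasta-build
    #   cmake ../shasta && make all -j && make install
    run sudo singularity shell --writable "$SANDBOX"

    # Step 4: convert the sandbox into a read-only sif image.
    run sudo singularity build "$IMAGE" "$SANDBOX"
}

build_container
```

Once the image exists, Shasta would be started with `singularity exec` on that image, giving the path at which the executable was installed inside the container.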

As I mentioned before, to do these steps you have to be familiar with singularity containers, but it is an easy and basic procedure.
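For the Ubuntu 16 workaround in step 2, the package list can also be pulled out of the prerequisites script instead of being read by eye. A rough sketch, assuming the script installs packages with one `apt install` or `apt-get install` command per line (this has not been verified against every Shasta release); the function name `list_apt_packages` is made up for illustration.

```shell
#!/usr/bin/env bash
# list_apt_packages SCRIPT
# Print, one per line, the package names passed to "apt install" or
# "apt-get install" in SCRIPT. Commands continued across several lines
# with a backslash are not handled.
list_apt_packages() {
    sed -E 's/#.*$//' "$1" |
        grep -E '(^|[[:space:]])(apt|apt-get)[[:space:]]+(-[^[:space:]]+[[:space:]]+)*install[[:space:]]' |
        sed -E 's/^.*install[[:space:]]+//' |
        tr -s '[:space:]' '\n' |
        grep -Ev '^(-|\\$|$)' |
        LC_ALL=C sort -u
}

# Example, using the script path given in the thread (run inside the container):
#   list_apt_packages shasta/scripts/InstallPrerequisites-Ubuntu.sh | xargs apt-get install -y
```

Packages that do not exist under the same name on the older Ubuntu release would still have to be sorted out by hand.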

paoloczi commented 3 years ago

Thank you for posting this information. This could be useful to other people in a similar situation.

paoloczi commented 3 years ago

I will close this issue, but please reopen it if additional information or related topics/questions emerge.