Open mrotival opened 3 years ago
Hi Maxime,
this was a very thorough investigation, thanks a lot! You very nicely localized the bug inside the counting code for --soloFeatures SJ. This is not a widely used option, so bugs are to be expected.
Would it be possible for you to share the data for one of the failed runs - the smallest one, e.g. one of the separate lanes that failed? To debug, I need to reproduce this failure locally.
Cheers Alex
Dear Alex,
Thank you for your quick response, and sorry for not coming back to you earlier.
The reason I took so long is that I'm using human data, which could in theory be used to call the donors' genotypes and identify them. Under French law, there are therefore legal restrictions on sharing these data.
Specifically, to comply with those restrictions, before sending a sample of the data I would need a statement from you that the data will only be used for software debugging purposes and that it will be deleted from your servers once that task is complete.
I'm attaching an example of such a statement. If you could sign it and send it back to me as a PDF, I'll be able to send you the data for one of the problematic sequencing lanes.
Best, Maxime
Hi Maxime,
sure, I can sign such a statement. It did not get attached, though; you would need to attach it from the GitHub site.
Cheers Alex
Dear Alex,
I am processing a large number (N>50 so far) of 10x libraries with STARsolo, each split across 8 lanes. For the vast majority of libraries STARsolo runs just fine (and extremely quickly!), but for two libraries (so far) it returned a segfault.
I'm running STAR on an HPC cluster with the following configuration.
I typically request 16 CPUs and 140 GB of RAM per job, and I use the following command:
and obtained a segfault. The last lines of the Log.out file were (full log attached for one library):
First hypothesis: a corrupted file?
At first, I suspected a corrupted input file, so I tried running each lane one by one:
For one library, STARsolo ran correctly for 7 out of 8 lanes, and removing the problematic lane solved the issue.
For the other library, STARsolo crashed on both lane 3 and lane 5, and ran correctly on lanes 1, 2, 4, 6, 7, and 8. Yet removing lanes 3 and 5 did NOT solve the issue.
Running the remaining lanes together, I obtained the same segfault at the same point.
This suggests that the problem is not simply a corrupted file, since files that each run fine on their own can still cause a crash when run together.
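The lane-by-lane test described above can be scripted. What follows is a minimal sketch, not the command actually used in this report: the genome directory, whitelist path, FASTQ naming pattern, and output prefixes are hypothetical placeholders, and the option set is only an assumption about what a typical 10x STARsolo run looks like. It builds one command per lane, plus one for the lanes that passed individually, without running anything.

```python
from typing import Iterable, List, Sequence

# Hypothetical placeholders: adjust to the real genome index, whitelist, and FASTQ layout.
GENOME_DIR = "/path/to/genome_index"
WHITELIST = "/path/to/barcode_whitelist.txt"
FASTQ_PATTERN = "{library}_L00{lane}_{read}_001.fastq.gz"


def star_solo_command(library: str, lanes: Sequence[int],
                      features: Sequence[str] = ("Gene", "SJ"),
                      threads: int = 16) -> List[str]:
    """Build (but do not run) a STARsolo command for the given lanes."""
    # Multiple lanes are passed as comma-separated lists, in the same order for both reads.
    cdna = ",".join(FASTQ_PATTERN.format(library=library, lane=n, read="R2") for n in lanes)
    barcode = ",".join(FASTQ_PATTERN.format(library=library, lane=n, read="R1") for n in lanes)
    tag = "-".join(str(n) for n in lanes)
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", GENOME_DIR,
        "--readFilesCommand", "zcat",
        # STARsolo expects the cDNA read first, then the barcode/UMI read.
        "--readFilesIn", cdna, barcode,
        "--soloType", "CB_UMI_Simple",
        "--soloCBwhitelist", WHITELIST,
        "--soloFeatures", *features,
        "--outFileNamePrefix", f"{library}_lanes{tag}_",
    ]


def isolation_plan(library: str, all_lanes: Iterable[int],
                   failing_lanes: Iterable[int]) -> List[List[str]]:
    """One command per lane, then one command combining the lanes that did not fail."""
    all_lanes = list(all_lanes)
    failing = set(failing_lanes)
    plan = [star_solo_command(library, [lane]) for lane in all_lanes]
    remaining = [lane for lane in all_lanes if lane not in failing]
    plan.append(star_solo_command(library, remaining))
    return plan


if __name__ == "__main__":
    for cmd in isolation_plan("LIB", range(1, 9), failing_lanes={3, 5}):
        print(" ".join(cmd))
```

Keeping the per-lane and combined commands in one plan makes it easy to check whether a crash follows a specific file or only appears when files are combined, which is the distinction drawn above.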
Locating the issue
Based on a comparison with libraries that ran correctly, my understanding is that the program crashed during the UMI collapsing phase of the splice junction counting.
And indeed, removing SJ from --soloFeatures seemed to solve the issue. (I'll keep going without the SJ feature for the time being, but I'm still reporting the issue in case it affects others.) In contrast, the segfault was still present when removing the parameters related to UMI deduplication, either one by one or all at once.
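The comparison above amounts to an ablation over the --soloFeatures list. Here is a minimal sketch of how such variants could be enumerated; the helper names and the starting feature list (Gene, SJ) are assumptions for illustration, since the original command is not reproduced in this thread.

```python
from typing import List, Sequence


def feature_ablations(features: Sequence[str]) -> List[List[str]]:
    """Return the full feature list, then every list with exactly one feature dropped."""
    variants = [list(features)]
    for dropped in features:
        rest = [f for f in features if f != dropped]
        if rest:  # skip empty lists: --soloFeatures needs at least one value
            variants.append(rest)
    return variants


def solo_features_args(features: Sequence[str]) -> List[str]:
    """Format one variant as command-line arguments."""
    return ["--soloFeatures", *features]


if __name__ == "__main__":
    for variant in feature_ablations(["Gene", "SJ"]):
        print(" ".join(solo_features_args(variant)))
```

Running each variant with otherwise identical options isolates which feature triggers the crash, in the same way that dropping SJ did here.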
Any idea what might be wrong here? Possibly a bug? Let me know if there is anything else I can provide to help with debugging.
Best,
Maxime
Attachment: L41.Log.out.txt