Closed sebastian-luna-valero closed 5 years ago
Output can be found in https://travis-ci.org/cgat-developers/cgat-apps/jobs/500769778
The bam2geneprofile issue is a rounding issue. For example:
differences: 0 upstream2500bp_zoomedTo1000bp 0 0.05263157894736842
0 upstream2500bp_zoomedTo1000bp 0 0.0526315789474
The test file is rounded to fewer decimal places. I suspect that this is due to the numpy version using different output rounding. I wonder if setting np.set_printoptions(precision=13) may help in this case?
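An alternative to pinning the print precision would be to compare numeric fields with a tolerance, so the tests do not depend on how many digits are written out. This is only a minimal sketch: the function names `tokens_match` and `lines_match` and the default tolerance are assumptions for illustration, not part of the cgat-apps test harness.

```python
import math


def tokens_match(expected, observed, rel_tol=1e-9):
    """Compare two tokens, numerically when both parse as floats."""
    try:
        return math.isclose(float(expected), float(observed), rel_tol=rel_tol)
    except ValueError:
        # At least one token is not a number: fall back to exact comparison.
        return expected == observed


def lines_match(expected_line, observed_line, rel_tol=1e-9):
    """Compare two whitespace-separated lines field by field."""
    expected = expected_line.split()
    observed = observed_line.split()
    if len(expected) != len(observed):
        return False
    return all(tokens_match(e, o, rel_tol) for e, o in zip(expected, observed))


# The two lines from the diff above differ only in printed precision.
reference = "0 upstream2500bp_zoomedTo1000bp 0 0.0526315789474"
new_output = "0 upstream2500bp_zoomedTo1000bp 0 0.05263157894736842"
print(lines_match(reference, new_output))
```

With a comparison like this the reference files would not need regenerating each time the printed precision changes, at the cost of hiding genuine differences smaller than the tolerance.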
Same issue for bam2peakshape.
However, for the runGO error: 12 days ago @AndreasHeger made the script raise a not-implemented error on line 1325 of GO.py. I'm not entirely sure of the reason for this.
For runGSEA, it seems jpeg is no longer supported as a format for matplotlib output. This can easily be fixed by changing the output to png or svg.
Thanks @sebastian-luna-valero. I am happy to make the changes and test once I get more information regarding the runGO error and feedback from rounding in numpy.
Hi @sebastian-luna-valero, when I upgraded to the newest version of numpy I realised that the output is no longer rounded to 13 decimal places. This broke some of our tests, so I regenerated the tests that show this.
For the runGO, I implemented a qvalue solution in python because the rpy2 version was deprecated.
For runGSEA I changed the instances of jpeg to svg, and on Linux this now works.
The tests on Linux are now passing, but runGSEA is failing on OSX with the following error:
libc++abi.dylib: terminating with uncaught exception of type NSException
I've never seen this before but was wondering if you had. I'm going to look into it a bit more. Unfortunately I don't have an OSX system to test on at the moment, so I can't get to the bottom of this yet. Could it be that OSX doesn't like the svg, and maybe I should output as a png? It seems a strange error though.
Thanks, Adam.
I don't have an OSX system at hand either. Could you please try png and see what happens before digging deeper into the issue?
Best regards, Sebastian
Changing the image output to png made no difference. I'm wondering if this is a similar issue to the one here: https://github.com/conda-forge/libcxx-feedstock/issues/29
It seems that the libcxx version may be the issue. Without a full OSX environment it's difficult to test and fix this. However, I may have some time over the weekend to work on it. I may merge the branch, considering that the Linux build is working fine.
I think we should troubleshoot further before merging.
I tried the following without success: https://github.com/MTG/sms-tools/issues/36
Sorry, guys, will get back into this.
When trying to get cgat-apps through I removed the last of the rpy2 dependencies. I think rpy2 was mostly used for doing FDR, for which there are now scipy equivalents.
The question is how often the GO and GSEA functionality is used.
I have used these recently. However, I did not entirely trust our GSEA and used fgsea in the end, which produces clearer plots. fgsea still does not give the same results as the Java GSEA app, because the statistics differ a bit between the two. I think when I compared them directly, the CGAT GSEA yielded yet different results and was difficult to run, with a lot of troubleshooting required, and I gave up at that point.
The GO analysis or enrichment analysis written by Katy works better and is reassuringly conservative, and I do use that more regularly, as the alternatives there are clunky. I especially like the implementation of backgrounds which works well for me.
Jakub
On 7 Mar 2019, at 21:44, Andreas Heger wrote:
The question is how often the GO and GSEA functionality is used.
Hi all, thanks for the comments, it would be my suggestion then to remove runGSEA, given that I struggle to read the code that is there and that it produces outputs that may be incorrect. This was written by Reshma as a training exercise and I don't think it was ever really checked for accuracy properly.
If everyone is in agreement then I will remove this script and remove it from the pipeline_enrichment. It will also reduce quite a bit of code that needs to be maintained (especially since it doesn't seem to be used by anyone).
@AndreasHeger I modified the FDR to make runGO.py Python-compatible instead of relying on R.
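For reference, an FDR correction that does not need R can be sketched in plain Python. The block below is a minimal Benjamini-Hochberg adjustment, not the code that went into GO.py; the function name `benjamini_hochberg` is hypothetical, and Benjamini-Hochberg is used here as a simple stand-in, which is not the same estimator as a Storey q-value.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    # Indices of the p-values, sorted from the largest p-value to the smallest.
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, index in enumerate(order):
        rank = n - position  # rank of this p-value in ascending order
        # Scale by n / rank, then enforce monotonicity and cap at 1.
        running_min = min(running_min, pvalues[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical p-values, only to show the call.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Walking from the largest p-value down keeps the adjusted values monotone, which is the usual step-up form of the procedure.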
Tests are now passing, and I am now removing gsea from the enrichment pipeline. However, I'm coming up against a few problems with testing that are related to cgat-core and the collect_benchmark_function. I will most likely raise an issue in cgat-flow if I can't fix it.
The tests for the following scripts are currently failing:
bam2geneprofile
bam2peakshape
runGO
runGSEA