cgat-developers / cgat-apps


Tests failing #33

Closed by sebastian-luna-valero 5 years ago

sebastian-luna-valero commented 5 years ago

The tests for the following scripts are currently failing:

sebastian-luna-valero commented 5 years ago

Output can be found in https://travis-ci.org/cgat-developers/cgat-apps/jobs/500769778

Acribbs commented 5 years ago

The bam2geneprofile issue is a rounding issue. For example:

differences:
0   upstream2500bp_zoomedTo1000bp   0   0.05263157894736842
0   upstream2500bp_zoomedTo1000bp   0   0.0526315789474

The reference file in the test is rounded to fewer decimal places than the new output. I suspect this is due to a numpy version change in how output is rounded. I wonder whether setting np.set_printoptions(precision=13) would help in this case?

Same issue for bam2peakshape.
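
As an alternative to tuning print precision, the comparison itself could tolerate small floating-point differences. Below is a minimal sketch, assuming whitespace-separated columns as in the example lines above. The helper name `fields_match` and the default tolerance are hypothetical and are not part of the cgat-apps test harness.

```python
import math


def fields_match(expected, observed, rel_tol=1e-9):
    """Compare two whitespace-separated lines field by field.

    Fields that parse as floats are compared with a relative tolerance;
    all other fields must be identical strings.
    """
    exp_fields = expected.split()
    obs_fields = observed.split()
    if len(exp_fields) != len(obs_fields):
        return False
    for exp, obs in zip(exp_fields, obs_fields):
        try:
            if not math.isclose(float(exp), float(obs), rel_tol=rel_tol):
                return False
        except ValueError:
            # At least one field is not numeric: fall back to exact match.
            if exp != obs:
                return False
    return True


# The two lines reported in the failing bam2geneprofile test.
new_line = "0\tupstream2500bp_zoomedTo1000bp\t0\t0.05263157894736842"
ref_line = "0\tupstream2500bp_zoomedTo1000bp\t0\t0.0526315789474"
print(fields_match(ref_line, new_line))
```

With a comparison like this, a change in the number of printed digits would not by itself cause a test failure, while a genuinely different value still would.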

However, for the runGO error: 12 days ago @AndreasHeger made the script raise a not implemented error on line 1325 of GO.py. I'm not entirely sure of the reason for this.

For runGSEA, it seems jpeg is no longer supported as a format for matplotlib output. This can easily be fixed by changing the output to png or svg.

Acribbs commented 5 years ago

Thanks @sebastian-luna-valero. I am happy to make the changes and test once I get more information regarding the runGO error and feedback from rounding in numpy.

Acribbs commented 5 years ago

Hi @sebastian-luna-valero, when I upgraded to the newest version of numpy I realised that the rounding is now different to 13 decimal places. This broke some of our tests so I remade the tests that show this.

For runGO, I implemented a qvalue solution in Python because the rpy2 version was deprecated.

For runGSEA, I changed the instances of jpeg to svg, and the tests now pass on Linux. However, runGSEA is failing on OSX with the following error: libc++abi.dylib: terminating with uncaught exception of type NSException

I've never seen this before but was wondering if you had. I'm going to look into it a bit more. Unfortunately I don't have an OSX system to test on at the moment, so I can't get to the bottom of this yet. Could it be that OSX doesn't like the svg and I should output a png instead? It seems a strange error though.

sebastian-luna-valero commented 5 years ago

Thanks, Adam.

I don't have an OSX system at hand either. Could you please try png and see what happens before digging deeper into the issue?

Best regards, Sebastian

Acribbs commented 5 years ago

Changing the image output to png made no difference. I'm wondering if this is similar to the issue here: https://github.com/conda-forge/libcxx-feedstock/issues/29

It seems that the libcxx version may be the issue. Without a full OSX environment it's difficult to test and fix this. However, I may have some time over the weekend to work on it. I may merge the branch, considering that the Linux build is working fine.

sebastian-luna-valero commented 5 years ago

I think we should troubleshoot further before merging.

I tried the following without success: https://github.com/MTG/sms-tools/issues/36

AndreasHeger commented 5 years ago

Sorry, guys, will get back into this.

When trying to get cgat-apps through, I removed the last of the rpy2 dependencies. I think it is mostly used for doing FDR, for which there are now scipy equivalents.

AndreasHeger commented 5 years ago

The question is how often the GO and GSEA functionality is used.

jscaber commented 5 years ago

I have used these recently. However, I did not entirely trust our GSEA and used fgsea in the end, which produces clearer plots. fgsea still does not give the same results as the GSEA Java app, because the statistics differ a bit between the two. I think that when I compared them directly, the CGAT GSEA yielded yet different results and was difficult to run, with a lot of troubleshooting required, and I gave up at that point.

The GO analysis or enrichment analysis written by Katy works better and is reassuringly conservative, and I do use that more regularly, as the alternatives there are clunky. I especially like the implementation of backgrounds which works well for me.

Jakub


Acribbs commented 5 years ago

Hi all, thanks for the comments, it would be my suggestion then to remove runGSEA, given that I struggle to read the code that is there and that it produces outputs that may be incorrect. This was written by Reshma as a training exercise and I don't think it was ever really checked for accuracy properly.

If everyone is in agreement then I will remove this script and remove it from the pipeline_enrichment. It will also reduce quite a bit of code that needs to be maintained (especially since it doesn't seem to be used by anyone).

Acribbs commented 5 years ago

@AndreasHeger I modified the FDR calculation so that runGO.py uses Python instead of relying on R.
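
For readers unfamiliar with what an R-free FDR correction involves, here is a minimal sketch of the Benjamini-Hochberg procedure in plain Python. This is an illustration only: the thread does not show the code that went into runGO.py, the "qvalue" approach mentioned earlier may differ from Benjamini-Hochberg, and the function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Illustrative sketch only; not the implementation used in runGO.py.
    """
    n = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each adjusted
    # value is capped by the adjusted values of all larger p-values.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


example_pvalues = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(example_pvalues))
```

Each p-value is scaled by n / rank, and the running minimum keeps the adjusted values monotone and no larger than 1.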

Acribbs commented 5 years ago

Tests are now passing, and I am now removing gsea from the enrichment pipeline. However, I'm coming up against a few problems with testing that are related to cgat-core and the collect_benchmark_function. I will most likely raise an issue in cgat-flow if I can't fix it.