mobiusklein / brainpy

A Python implementation of Baffling Recursive Algorithm for Isotopic distributioN calculations
http://mobiusklein.github.io/brainpy
Apache License 2.0
19 stars 10 forks source link

Segmentation fault #7

Closed Gscorreia89 closed 7 months ago

Gscorreia89 commented 7 months ago

Hi,

I have noticed a weird C segmentation fault error whenever elements with atomic number >= Polonium are present in the formula:

For example: molecularFormula = {'Cm':1} theoretical_isotopic_cluster = isotopic_variants(molecularFormula, npeaks=1, charge=1)

will cause a Segmentation fault (core dumped) error and kill the python session.

I have checked that all these elements are defined in mass_dict. It might be a red herring, but I also noticed a suspicious pattern in the mass_dict file entries: problematic elements contain entries where integers have been defined without a decimal place (and will be parsed as int instead of float in python), for example: 'Po': { 0: (209, 1.0), 188: (187.999422, 0.0),

where should have been 0: (209.0, 1.0)

Hope this is helpful to pinpoint the cause, happy to provide more details to help replicate issue if needed, and thanks a lot for this neat package!

mobiusklein commented 7 months ago

Thank you for reporting this. I think the problem is more that brainpy doesn't create an Isotope for any isotopologue whose natural abundance is 0, and that the code does not check if the isotopes data structure is empty before trying to interact with it. This seems to fit most (all?) elements whose atomic numbers are that large.

I can look at either providing better handling here, or at least a better error message.

mobiusklein commented 7 months ago

I think I've fixed this in the latest release. Could you please take a look at v1.5.16 which should be on PyPI shortly and see if it does what you expect?

Gscorreia89 commented 7 months ago

Hi,

Thanks for the fast fix! All seems to be working, just gave it a try and I no longer get a segmentation fault. Now they return a peak list object with a single Peak, which I would say is reasonable given the natural abundance = 0. I have not checked thoroughly the accuracy of these results for these larger elements. I am using brainpy to parse small molecule formulas from public DBs, and my issue was that some of these elements sporadically appear in entries in DB's and the fact that it caused a segmentation fault made it trickier to handle exceptions in python. Thus from my side I consider the issue fixed!

mobiusklein commented 7 months ago

Thank you again for reporting this. The database I am using had three ways of specifying elements that all looked similar, and I wouldn't have looked more carefully without this being raised.