Brewtarget / brewtarget

Main brewtarget source code repository.
GNU General Public License v3.0
312 stars 134 forks source link

Compiling Brewtarget from source for the first time produces a Segmentation Fault #669

Closed Jaczel closed 1 year ago

Jaczel commented 1 year ago

Hi,

Please let me know if I am reporting this bug incorrectly. Also, I'm not a cpp programmer so please forgive my ignorance in some of these areas.

I compiled the devel branch from source and was getting Segmentation Fault errors. This appears to be when loading the database. Running brewtarget through gdb and doing a backtrace gave me some insight into where to go to try and make progress. The output can be found here https://pastebin.com/raw/GnRQ8tYz

Making an edit to src/database/ObjectStore.cpp:1490 allowed me to finish the initial setup and Brewtarget now launches fine. The edits I made are below.

std::optional< std::shared_ptr<QObject> > ObjectStore::findFirstMatching(
   std::function<bool(std::shared_ptr<QObject>)> const & matchFunction
) const {
   auto result = std::find_if(this->pimpl->allObjects.cbegin(), this->pimpl->allObjects.cend(), matchFunction);
   if (result == this->pimpl->allObjects.end()) {
      return std::nullopt;
   }
   //return *result;
   return std::nullopt;
}

Once Brewtarget launched correctly I restored this file to original. But Brewtarget continues to launch correctly. I now cannot recreate the issue.

I suspect that the issue lies in creating the database for the first time. I have tried deleting ~/.config/brewtarget and re-cloning and compiling. Are there any other locations where brewtarget stores information?

Any insight would be appreciated.

Some possibly relevant information about my system:

Operating System: Ubuntu 22.04 Boost version: 1.80 Branch: devel

matty0ung commented 1 year ago

Hi,

Thanks for the detailed bug report, and for including logs and stack trace. This is exactly the place to raise it.

From the stack trace, it looks like the program found an old version of the database file and was trying to upgrade it. When you deleted the contents of ~/.config/brewtarget then the next run of the program would have just created a new default database, so that upgrade code would not have been run.

I suspect there's a bug somewhere where we're doing something silly with a shared pointer (eg creating two unrelated shared pointers to the same object) which will be my error, and that your temporary edit suppressed the symptoms (viz an object being prematurely deleted).

If, by any chance, you still have a backup copy of your old Brewtarget DB, it would probably help us reproduce the problem. Otherwise we might see if @mikfire has some old DBs, has he has helped enormously with bug-finding in the upgrade process.

Jaczel commented 1 year ago

Hi @matty0ung

Thanks for your quick response.

I scrounged up an old database from 2020. It spits out a bunch of errors similar to my previous output but I don't hit a seg fault. I'm still confused as to where and why this seg fault occurred... I can also see if I have a more recent database.

Output: https://pastebin.com/raw/cj95Vh7B

Database used: https://drive.google.com/file/d/1FiJ38Qnb1HTpnexeLI-tC6GVziEcWPhy/view?usp=sharing

matty0ung commented 1 year ago

Thanks for that. So, this latest one is a different issue. It seems some of the data in that DB file is not quite as we expect. But we could probably add something to the code that would fix things. I'll have a look.

(By way of background, for Hops, Fermentables and so forth, we store two types of record. One is the "parent" record that describes, eg, a particular Hop in general. Then, every time we use that Hop in a recipe we create a "child" record that records the use of that hop. This also means, you can edit, eg, alpha acid on a hop use in a recipe without changing the alpha acid values for other uses of that hop in other recipes. Long story short, each hop (or fermentable, or misc or yeast) record in the DB is either a "parent" or a "child". The error we're seeing with this 2020 DB is that there are some records that are trying to be both parent and child. The fix is probably to either make them only parent records or to split each of them into separate parent and child records.)

If you do also find a DB that creates the seg fault that would be great too, obvs.

matty0ung commented 1 year ago

I think the "Fermentable # 382 is its own parent!" and similar errors are related to https://github.com/Brewtarget/brewtarget/issues/657 - or at least I would expect to see the same "Fermentable # 382 is unexpectedly already used in recipe # Y" warning logged if we started trying to add Fermentable 382 to another recipe.

I have a bit of an idea for how to add code to make things more robust when we encounter "data errors" such as this, but I think I'd better tackle them in version 3.1 rather than version 3.0.x because it's not just a quick fix.

(Essentially, the idea is that, when we hit "X is its own parent" we should create a clone of X and make that clone the parent of X. Ideal place to trigger this would be NamedEntity::setParentKey and then use similar logic to the templated copyIfNeeded function in Recipe.cpp. But we'd have to add a pure virtual function to NamedEntity and implement it with some boilerplate code in each of the classes that derives from NamedEntity because, for reasons explained in NamedEntity.h, we can't use the "usual" trick of CRTP.)

Jaczel commented 1 year ago

Sorry for the delay on this one. I've managed to reproduce the error.

I built a new, clean Ubuntu 22.04 image in Virtualbox. I compiled the latest brewtarget and launched it without creating ~/.config/brewtarget (i.e. the experience that a new user would have).

The first run of launching Brewtarget complained about not being able to open the database. I then ran it again and the segmentation fault returned.

Please find attached the output of the two runs. brewtarget-error.txt

matty0ung commented 1 year ago

Thanks, that's useful diagnostic. Looks like there's a problem with the order in which things are getting initialised. Will have a look.

matty0ung commented 1 year ago

I've added some more diagnostics to the code and, hopefully, improved the logic it uses for working out where files are. The output you pasted suggested the code wasn't finding some of the "standard" locations (eg where default DB lives etc). Once https://github.com/Brewtarget/brewtarget/pull/677 is approved, if you get a chance to try again, that would be great. (It might still give errors, but hopefully we'll have additional clues about the root cause!)

Jaczel commented 1 year ago

Hi @matty0ung

I've pulled, recompiled and produced the same error by running Brewtarget twice. I've attached the logs below. To do this I did the following.

rm -rf ~/.config/brewtarget
cd brewtarget
git pull origin develop
cd ../brewtarget-build
cmake ../brewtarget
make
./brewtarget
./brewtarget

Please let me know if there is something else I can do.

brewtarget-error-677.txt

matty0ung commented 1 year ago

Thanks for bearing with us and providing the extra diagnostics and info.

The good news is I think there's a simple fix for the issue. You just need one extra step in your build process. After running make, if you run sudo make install, it will put not only the binary but all the other files it uses (including default database) in places where the code can find them. (Running which brewtarget should give /usr/bin/brewtarget on Ubuntu for instance.) Then run brewtarget (NB not ./brewtarget) and, fingers crossed, everything should work.

Let us know how you get on.

Meanwhile, I'll have a look at our build scripts to see if we can make them issue a prompt for the extra build step.

Jaczel commented 1 year ago

Ah, that solved it. I had just assumed that compiling and running locally would be enough. The first run after running sudo make install produced a seg fault. I then removed ~/.config/brewtarget, ran brewtarget again and the program launched. I have posted the successful result in case it is of any use.

brewtarget-error-677-make-install.txt

Jaczel commented 1 year ago

With that in mind, I think I'm happy to close this issue. Assuming it's expected behaviour that sudo make install must be run for brewtarget to launch.

Let me know your thoughts @matty0ung

matty0ung commented 1 year ago

Very glad to hear you got things working.

I'll leave this issue open for now, as I think there are still some start-up things we could be doing better. (Looks like PersistentSettings is returning an empty string instead of std::nullopt when we don't yet have a config file. Doesn't break anything but it's untidy and hopefully a small fix. Also call of ObjectStore::TableDefinition on nullptr shouldn't happen, even on first-ever run of the program, so I'll try to have a think about what might be going on there.)

matty0ung commented 1 year ago

Closing now that release 3.0.4 is out.