opensourceBIM / BIMserver

The open source BIMserver platform
GNU Affero General Public License v3.0

Local checkin #963

Open tchegito opened 5 years ago

tchegito commented 5 years ago

Not a bug, but maybe a silly question:

Is there a way to quickly check in files that are local to the server?

For example, I installed BIMserver on a remote machine, and I have some big IFC files on it. In BIMvie.ws there are two ways to check in: file or URL. But the file has to be on the client machine, not on the server. And even with a URL, I can't get decent performance.

With "checkin with URL" I was expecting the same speed as with "curl", just for the transfer. But it seems to take as long as if the file were a remote one. The screen displays "Deserializing..." instead of "Uploading..." (which we get with checkin from file). Even with a URL in the same domain, it seems to be a remote download.

I'm talking about the transfer only. That time doesn't include inverses generation or geometry. My analysis is based only on how slowly the file grows in the "WEB-INF/incoming/" folder.

Maybe I missed something?

tchegito commented 5 years ago

On reflection, my question may be unclear ;) I could rephrase it like this:

Unless there's already a way to do this, I may give it a try in a pull request.

rubendel commented 5 years ago

This is what happens during checkin: as soon as the first bits of the file arrive on the server, it starts processing the IFC entities and storing the records in the database [1]. So the perceived slow upload speed is not actually the upload speed: the file will only upload as fast as BIMserver can process and store the data (obviously there are some buffers in between). There are a few reasons for doing it this way (as opposed to how it worked a few years ago):

As for the file size in "incoming", it only grows at the same speed as the IFC file is processed, because the stream is simply multiplexed. So it's not a serial process, but a parallel one.
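To illustrate the multiplexing described above, here is a minimal Java sketch of a "tee" stream: every chunk the consumer reads is also written to a copy, so the copy grows only as fast as the consumer processes the data. The class names `TeeInputStream` and `TeeDemo` are hypothetical and this is not BIMserver's actual code; counting `;` characters merely stands in for real IFC deserialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration, not a BIMserver class: copies every byte read to a second stream. */
final class TeeInputStream extends FilterInputStream {
    private final OutputStream copy;

    TeeInputStream(InputStream source, OutputStream copy) {
        super(source);
        this.copy = copy;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            copy.write(b);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            copy.write(buf, off, n); // the copy advances only when the consumer reads
        }
        return n;
    }

    @Override
    public boolean markSupported() {
        return false; // mark/reset would duplicate bytes in the copy; skip() is not mirrored either
    }
}

final class TeeDemo {
    /** Stand-in for deserialization: counts ';'-terminated records while the tee fills the copy. */
    static int countRecords(InputStream upload, OutputStream copy) throws IOException {
        int records = 0;
        try (InputStream in = new TeeInputStream(upload, copy)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                for (int i = 0; i < n; i++) {
                    if (buf[i] == ';') {
                        records++;
                    }
                }
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "#1=IFCWALL();\n#2=IFCDOOR();\n".getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream copy = new ByteArrayOutputStream();
        int records = countRecords(new ByteArrayInputStream(data), copy);
        System.out.println(records + " records, " + copy.size() + " bytes copied");
    }
}
```

If the consumer is slow, the copy grows slowly as well, which matches the observation that the file in "incoming" grows at processing speed rather than at network speed.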

So to get back to your original question: sure, that could be implemented, but I really don't think there is a need for it, because you'll observe basically the same processing time. Most probably more than 90% of it is spent on geometry generation anyway (if not, you might need more memory).

[1] This statement is not completely true: objects are only flushed to disk every 1,000,000 records, and even then they are only regarded as valid when the transaction commits. Also, the nature of IFC files makes it impossible to store certain entities as they arrive, because of the references they contain.
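The point about references can be made concrete with a small Java sketch. It scans simplified STEP-style records and reports ids that are referenced before the record defining them has arrived; such records cannot be fully resolved at the moment they are read. The class `ForwardReferenceScan` and its regular expressions are hypothetical simplifications (for instance, a `#12` inside a string literal would be miscounted), not BIMserver's deserializer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration: finds ids that are used before they are defined. */
final class ForwardReferenceScan {
    // Simplified record shape: #id = TYPE(arguments);
    private static final Pattern RECORD =
            Pattern.compile("#(\\d+)\\s*=\\s*(\\w+)\\s*\\((.*)\\)\\s*;");
    private static final Pattern REF = Pattern.compile("#(\\d+)");

    /** Returns, in order of first use, the ids referenced before their defining record. */
    static List<Long> forwardReferences(List<String> lines) {
        Set<Long> defined = new HashSet<>();
        List<Long> forward = new ArrayList<>();
        for (String line : lines) {
            Matcher record = RECORD.matcher(line.trim());
            if (!record.matches()) {
                continue; // header lines, blank lines, anything not shaped like a record
            }
            Matcher ref = REF.matcher(record.group(3));
            while (ref.find()) {
                long target = Long.parseLong(ref.group(1));
                if (!defined.contains(target) && !forward.contains(target)) {
                    forward.add(target);
                }
            }
            defined.add(Long.parseLong(record.group(1)));
        }
        return forward;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "#1=IFCRELAGGREGATES($,#3,(#2));", // refers to #3 and #2, neither seen yet
                "#2=IFCWALL('a');",
                "#3=IFCBUILDING('b');");
        System.out.println(forwardReferences(lines));
    }
}
```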

tchegito commented 5 years ago

I just implemented a "Local checkin" on my forked repo, and the performance gain is drastic! Instead of 3 hours for a 176 MB IFC, it now takes only 3 minutes. The major time-consuming part was indeed the upload, and removing it really matches my customer's needs.

In his context, the IFC files come from the same network domain, so it's fast to copy them from one server to another. But with the two current ways of checking in, I was unable to get acceptable performance.

Now it's OK. If you're interested in the feature (it's rather simple), it's here, and I could make a pull request.

rubendel commented 5 years ago

Are you saying your internet connection is 133 kb/s? Wow, that's really slow (just out of interest, where are you based?). But yes, in that case I can understand why this is a solution for you. I can imagine a few more people being interested in this, but I am a bit concerned about the security implications. This basically opens up a way to read any file accessible to the user running BIMserver. Do you have any ideas on that?
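The figure of about 133 kb/s appears to come from dividing the file size by the elapsed time. The Java sketch below redoes that arithmetic, assuming that "176Mb" means 176 megabytes and that the whole 3 hours were spent on transfer; the class name `TransferRate` is hypothetical.

```java
/** Hypothetical helper: average transfer rate implied by a size and a duration. */
final class TransferRate {
    /** Average rate in kilobits per second (1 kbit = 1000 bits). */
    static double kilobitsPerSecond(long bytes, long seconds) {
        return bytes * 8.0 / seconds / 1000.0;
    }

    public static void main(String[] args) {
        long threeHours = 3L * 3600L;
        long decimalBytes = 176L * 1000L * 1000L; // 176 MB with decimal prefixes
        long binaryBytes = 176L * 1024L * 1024L;  // 176 MiB with binary prefixes
        System.out.printf("decimal: %.1f kbit/s%n", kilobitsPerSecond(decimalBytes, threeHours));
        System.out.printf("binary:  %.1f kbit/s%n", kilobitsPerSecond(binaryBytes, threeHours));
    }
}
```

Depending on whether decimal or binary prefixes are assumed, the result falls between roughly 130 and 137 kbit/s, which brackets the 133 kb/s quoted above.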

tchegito commented 5 years ago

Fortunately not ;) My connection (and my customer's too) has an upload rate of around 3 Mb/s. We don't have optical fibre, and sometimes the upload rate can be worse. I understand your concerns about security. Currently, the only limit is that the file must be readable by the "tomcat" user, the one that runs Tomcat, but that is not really by design. If you would like to integrate this feature into BIMserver, I think a reserved folder could be a good choice: either a folder configured in the server settings, or a default one such as "WEB-INF/localIfcs". Honestly, I hadn't thought about that before you raised it.
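A minimal Java sketch of the reserved-folder idea follows. It resolves a client-supplied file name against a base directory and rejects anything that ends up outside it, including via ".." segments, absolute paths, or symbolic links. The class `LocalCheckinPaths` and the method `resolveInside` are hypothetical names, and nothing here reflects how BIMserver or the fork actually implements the check.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical illustration of confining local checkins to one reserved folder. */
final class LocalCheckinPaths {
    /**
     * Returns the real path of the requested file if it is a regular file inside baseDir.
     * Throws SecurityException if it escapes baseDir, IOException if it does not exist.
     */
    static Path resolveInside(Path baseDir, String requested) throws IOException {
        Path base = baseDir.toRealPath();
        // An absolute "requested" replaces base entirely, so the check below rejects it.
        Path candidate = base.resolve(requested).normalize();
        if (!candidate.startsWith(base)) {
            throw new SecurityException("Path escapes the reserved folder: " + requested);
        }
        // Resolve symbolic links as well, then check again.
        Path real = candidate.toRealPath();
        if (!real.startsWith(base) || !Files.isRegularFile(real)) {
            throw new SecurityException("Not a regular file in the reserved folder: " + requested);
        }
        return real;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java LocalCheckinPaths <reserved-folder> <requested-name>
        System.out.println(resolveInside(Paths.get(args[0]), args[1]));
    }
}
```

Checking the path again after `toRealPath()` matters because a symbolic link placed inside the reserved folder could otherwise point at any file readable by the server's user.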

By the way, I did further tests with the huge IFC (mentioned in this issue: https://github.com/opensourceBIM/BIMserver/issues/962) and it "only" took 20 minutes. The model involved could have been optimized, because it contains a lot of repetitions of complicated meshes, but that's not an option for my customer, since he's not responsible for the IFC files.

rubendel commented 5 years ago

That sounds like a good idea. If you feel like implementing it that way, that would be great.

Repetitions should usually not be a problem, but they need to be modelled correctly for IfcOpenShell and BIMserver to detect them (using Mapped items).

rubendel commented 5 years ago

Let me know if you want to pick this up, I'll leave the issue open then, otherwise I'll close it.

tchegito commented 5 years ago

I'll try to work on this next week, considering the feature is already done on my branches, except for the security matter.

halmsx commented 5 years ago

Even on my local machine, it took at least 20 minutes for a 100 MB IFC file to finish processing. I hope this gets added to the main development branch.

rubendel commented 5 years ago

Two pull requests would be nice :)