Add Drawings support - Githubissues

sebjf commented 3 months ago

Description

Add a command to bouncer client to import a DWG as an SVG to be saved to the new drawings collection, and hook this up to the appropriate (or new) queue in worker.

Goals

[ ] .io can create a job that results in a drawing being imported as an svg

Tasks

[x] Add a new importer or importer mode for DWGs that can write SVGs
[x] Add a new importer or importer mode for DGNs that can write SVGs
[x] Add a new command to bouncer client
[x] Create ModelRevisionNode subclass and change existing RevisionNode instances to be ModelRevisionNodes
[x] Create DrawingRevisionNode type
[x] Create a "drawing manager" to hold things like the db definitions that Scene Manager has for 3D scenes.
[x] Add logic to get the rFile from a Drawing Revision Node given a Revision Id in RepoManipulator
[x] Add method to FileManager to get fully qualified file from RefNode
[x] Add method to write drawing import stream to file and update drawing revision node with image
[x] Move DrawingImageInfo somewhere proper
[x] Clean up namespaces
[ ] Update bouncerworker

Related Resources

Product Ticket: https://github.com/3drepo/3D-Repo-Product-Team/issues/377 Control flow: https://github.com/3drepo/3drepo.io/issues/5006 .io Endpoints: https://github.com/3drepo/3drepo.io/issues/4983

sebjf commented 3 months ago

Hi @carmenfan , first question!

The way the SVG exporter works is that it is much like our exporter that runs on an ODA database object. So, the readFile methods of FileProcessorDgn or FileProcessorDwg would be very similar, except they'd call a different method (e.g. importSvg instead of importDgn after setting up the views etc.

Do you want me to refactor the existing classes so they have two methods, one for importing to models & another for importing to Svgs, or just create new classes for the svg import and have a small amount of duplicate code between them?

(My instinct is that this should be in the same class, even if we end up duplicating view setup etc to make slight changes - after all the class is named after the file type it receives, not the type it outputs...)

carmenfan commented 3 months ago

@sebjf agreed, same class because of the reasoning you said 😆

sebjf commented 2 months ago

Hi @carmenfan,

I have a version of this working now. It is set up as follows:

RepoManipulator has a new method, processDrawingRevision, which handles the import and update of the drawing revision node and ref nodes. A class, DrawingManager, interacts with the database to retrieve and update revision nodes. It is in this definition that the schema is implemented. DrawingImportManager marshals revised FileProcessorDwg or FileProcessorDgn instances, which use the ODA SVG exporter device to create an SVG from the source files. This class is only concerned with file processing; it takes in a filename and outputs bytes (& soon a calibration I expect).

DrawingImageInfo is used to communicate between the three. For now, this class just holds the drawing name and a vector of bytes for the svg contents. It is created by RepoManipulator and passed via DrawingImportManager to the file processors. It has analogies to both GeometryCollector and RepoScene but is not directly equivalent to either; the SVG exporter creates something that is far closer to what we write immediately to disk than RepoScene, and there is no intermediate step required unlike GeometryCollector. Therefore the same type is passed directly through to the file processors.

The file processors work by storing a pointer to the DrawingImageInfo as a member. In readFile(), if this is non-null, the drawing import code runs. Similarly, GeometryCollector instances can now be null, and if not, the model import code runs. (The motivation for this is that there's a lot common between the two, but the use of try/catch blocks and the explicit initialisation/deinitialisation of ODA resources make having separate functions difficult).

I now just need to polish it a bit, and had a few questions/things to check in regards to this! (These basically boil down to how much effort do you want to put into type safety/verbosity...)

(1) Are you happy with DrawingImageInfo being passed right through, or would you like an intermediary (or higher level) type for decoupling purposes?

(2) Currently the DrawingImageInfo member is added directly to FileProcessorDwg & FileProcessorDgn, do you want it in the FileProcessor base class?

(3) At the moment, we have three ref node schemas to consider:

Legacy/Models

{
    "_id" : "18f92421-5810-48a2-aaf2-6904011f5ad8.repobundle",
    "type" : "fs",
    "link" : "130/207/0bafc0a6-2f1f-4ea0-bf3c-ae5c4704bf0e",
    "size" : 193824,
    "encoding" : "gzip"
}

Drawing rFile

{
    "_id" : "0ac71fb2-a18d-4ef8-b5a8-b87879326300",
    "type" : "fs",
    "link" : "197/204/f959eda6-3fde-436a-ba0e-b1aa777a413d",
    "size" : 2419670,
    "name" : "corner.dgn",
}

Drawing Image

// drawings.history.ref
{
    "_id" : "0ac71fb2-a18d-4ef8-b5a8-b87879326300",
    "type" : "fs",
    "link" : "197/204/f959eda6-3fde-436a-ba0e-b1aa777a413d",
    "size" : 2419670,
    "mimeType": "image/svg+xml", 
    "project": LUUID("bfe0c3fe-dd1d-4844-ad04-cfb75df26a63"),
    "model": LUUID("aeb0c3fe-dd1d-4844-ad04-cfb75df26a63"),
    "rev_id" : LUUID("cad0c3fe-dd1d-4844-ad04-cfb75df26a63"),
}

Currently I am passing RefNodes and using existing APIs (such as the metadata argument to makeRefNode) to implement the schema.

(a) Shall we continue like this, or would you prefer a subclass for each of the above? (b) If we do add new classes/pathways, should we change the key type from string to LUUID on (2) and (3)?

(4) For testing I have added a flag, drawing, to the import command of 3drepobouncerClient, which triggers the drawing revision processing, rather than the model import. Would you like a new command for processing drawings, or leave as an import option? (Not sure of the implications for the queues etc for either of these!)

(5) Currently the namespaces need a bit of cleaning up. There are two new namespaces, repo::manipulator::drawingutility and repo::manipulator::drawingconverter, however DrawingRevisionNode is in repo::core::model, as are the file processors.

Do you want a new namespace repo::core::drawing, for example to hold DrawingImageInfo, and for them to be parallel where possible, or for it/them to be perhaps a child of model?

carmenfan commented 2 months ago

1) I think it's fine base on what you're saying, it's basically representative of the file

2) I think it should be in the baseclass

3) a) continue like this is fine as long as it works for you. Though I don't think the Rfile should have an incomplete flag? that should reside with the revision node like 3d, and also I don't think it needs an extension? b) I would prefer LUUIDs where possible

4) I think it's better if we have a separate command as it does pretty different things and takes in similar file types

5) repo::core should be representative to what we store in 3drepo database. Considering DrawingImageInfo doesn't live as an entity I don't think it should live there, sam ewith file processor (it's a manipulator, it's doing things to convert data). not sure if we should have a drawingconverter namespace either, can you tell me which class are in which namespace?

sebjf commented 2 months ago

Hi @carmenfan, thanks!

3 a) I think I misread and it should be in the drawing revision, though it won't make any difference to bouncer anyway! I will leave it like this then for now until there is a need to change it. 3 b) The way I have it now is that the DrawingRevisionNode stores all its entries as LUUIDs (including rFile and image), and then RepoManipulator turns them into strings when it hands them to FileManager. I will leave it like this if we aren't introducing dedicated types, since that is as far into LUUIDs as it is possible to go without introducing overloads throughout the file management classes.

5 The new classes are:

repo::manipulator::drawingutility::DrawingImageInfo repo::manipulator::drawingutility::DrawingManager repo::manipulator::drawingconverter::DrawingImportManager repo::core::model::DrawingRevisionNode

sebjf commented 2 months ago

Also

and also I don't think it needs an extension?

This was in response to

We will record the file extension in the object instead of having to regex it out of the rFile string like in containers.

Did that mean something else?

EDIT: Ah I expect I misread and it meant in the Drawing Revision Node! I will update it...

carmenfan commented 2 months ago

Also

and also I don't think it needs an extension?

This was in response to

We will record the file extension in the object instead of having to regex it out of the rFile string like in containers.

Did that mean something else?

EDIT: Ah I expect I misread and it meant in the Drawing Revision Node! I will update it...

Again I'm referring to the revision node 😆 so something Christos will handle on the backend

carmenfan commented 2 months ago

repo::manipulator::drawingutility::DrawingImageInfo repo::manipulator::drawingutility::DrawingManager repo::manipulator::drawingconverter::DrawingImportManager repo::core::model::DrawingRevisionNode

I think DrawingImportManager should probably live in modelConverter, if we see the file as the 3d model, then svg is just an outcome of converting the model.

Similar with the 2 in drawing utility, I think it'll be simplier to just put them in modelUtility.

This is just my thinking without looking at the code so do let me know if you don't feel like it make sense!

sebjf commented 2 months ago

Hi @carmenfan,

I have made those changes and am now looking at bouncer worker. I remember us discussing different possible queue configurations.

Looking at the code, messageDecoder (which builds the process args list from the queue object) is common. The only thing specific to the model import in modelQueueHandler is the processInformation, which expects some members specific to the import arguments (user, model, etc). However, this object only ever ends up in processMonitor, which only seems to use the Rid member.

jobQueueHandler is written a little differently but actually does the same thing as modelQueueHandler; using messageDecorder, runBouncerCommand, and generating the return message in the same way. The only real difference is the wording of the error in the catch block.

I am wondering then,

Since we aren't sure whether we will put drawings in the model queue or their own at this stage, shall I refactor modelQueueHandler into something like bouncerQueueHandler, that can handle all bouncer commands, so we have one type of handler that can be attached to multiple queues?
If yes, shall I also replace jobQueueHandler with bouncerQueueHandler, since we are working on it?

carmenfan commented 2 months ago

@sebjf prior to all that, can we run some stats to determine whether we want to use either of the queue or want to setup a new queue for drawings?

I think this all depends on how cpu/memory/time intensive it is going to be

sebjf commented 2 months ago

Some initial results for import times.

These figures were taken with the Visual Studio Performance Profiler in Memory Mode. Therefore the processing times are inflated, but memory should be quite accurate.

Drawing	Source File Size (Kb)	Time (s)	Memory (Mb)
Drawing10	1989	2	54
Drawing3	2885	4	65
Drawing4	3248	18	963
Drawing9	6051	7	135
Drawing6	37420	24	910
Drawing1	47971	31	945
Drawing5	60660	16	681
Drawing8	61524	9	570
Drawing2	62615	16	1015
Drawing11	141186	88	3800
Drawing7	141621	87	4200

The sample drawings are real drawings taken from a couple of different projects. The source file size is used as an approximation of 'complexity', though as expected the use of references & instances does affect the resources used.

Overall, the memory peaks for even the largest drawings are a fraction of those of the models, but they are not exactly tiny.

Anecdotally, the import process has two peaks: one reading the file, which for the largest drawings was about 1.6 Gb, and a second when writing the SVG, which depending on the drawing can be much larger.

3drepo / 3drepobouncer

Add Drawings support #687

Description

Goals

Tasks

Related Resources