Avalanche-io / pyc4_old

Python module for the Cinema Content Creation Cloud framework.

Apache License 2.0

Multitarget copy #4

Closed mjaganathan closed 8 years ago

mjaganathan commented 8 years ago

Copy source to multiple destinations simultaneously.

mjaganathan commented 8 years ago

Joshua, could you elaborate on this issue? Is there a particular folder structure, i.e. where would the source and destination files reside? Does "multiple targets" mean a fixed number of destination copies, or does it vary? The current implementation has the project path at \root\crd\c4py on Linux and d:/c4/c4py on Windows.

mrjoshuak commented 8 years ago

@mjaganathan You don't need to change assignments. I got the message previously (we're both subscribed to the issue). I'm only able to get to it now.

See the c4go issue for details.

mjaganathan commented 8 years ago

@JoshuaKolden Taking the c4go issue as a reference, we are working on this issue. We will update once it's done.

mjaganathan commented 8 years ago

@JoshuaKolden, if I recursively traverse through my source directory and encounter 100 files, do we need to display the C4 IDs for each file in the console, or output them to a new file?

mrjoshuak commented 8 years ago

The process is IDing the files at the same time it's copying them, so it should output the IDs as it goes, just as it does now without the copy. In other words, the ID process remains the same; we're just adding the copy functionality.

mjaganathan commented 8 years ago

@JoshuaKolden, the code has been submitted and works on both Windows and Linux. Please review and provide comments, if any.

mrjoshuak commented 8 years ago

Got it. I'll take a look. Thanks

mrjoshuak commented 8 years ago

At c4.py:19 you open a file for reading. You use that to calculate the c4id. At no time while the file is open do you buffer it, or copy it to a 'target' location. In fact you never do any copying of the file anywhere in the code that I can find. I'm not sure how this has anything to do with "multitarget copy". As far as I can tell it doesn't even do a single copy.

Am I looking at the wrong thing?

mrjoshuak commented 8 years ago

Ok, I see at line 95 that you use shutil.copytree to do the copying.

No. This completely defeats the point. It gains nothing from hashing the file, because the two operations are completely independent. The intent is to take advantage of the fact that we already have the data in memory and write it to multiple target files at once, in a single operation.

1. Open file for read
2. Open 1 or more files for write
3. Read block
4. Add block to hash
5. Write block to one or more target files
6. Repeat 3-5 until end of source file data
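As a rough illustration, the steps above could be sketched in Python along these lines. This is not the code from the repository: the function name `copy_and_hash`, the 1 MiB default block size, and the choice to return a plain SHA-512 hex digest (rather than a fully encoded C4 ID) are all assumptions made for the example.

```python
import contextlib
import hashlib


def copy_and_hash(source, targets, block_size=1024 * 1024):
    """Copy `source` to every path in `targets`, hashing it in the same pass.

    Hypothetical helper: returns the SHA-512 hex digest of the source data.
    Encoding that digest as a C4 ID is left out of this sketch.
    """
    digest = hashlib.sha512()
    with contextlib.ExitStack() as stack:
        # Open the source for reading and every target for writing.
        src = stack.enter_context(open(source, "rb"))
        outputs = [stack.enter_context(open(path, "wb")) for path in targets]
        while True:
            block = src.read(block_size)   # read block
            if not block:                  # empty read means end of file
                break
            digest.update(block)           # add block to hash
            for out in outputs:            # write block to each target
                out.write(block)
    return digest.hexdigest()
```

Each block is read from disk once and then reused for the hash and for every target, which is the property that a separate `shutil.copytree` call cannot provide.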

That's the simple version. The version I actually want has a buffer so that reading, hashing, and file writing can each run as fast as possible in different threads.
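One possible shape for the buffered variant, again only as a sketch under assumptions: the calling thread reads blocks and pushes them onto bounded queues, one queue for a hashing thread and one per target writer. The name `threaded_copy_and_hash`, the `depth` parameter (queue length in blocks), and the SHA-512 hex digest return value are hypothetical choices for illustration, not the project's API.

```python
import contextlib
import hashlib
import queue
import threading


def threaded_copy_and_hash(source, targets, block_size=1024 * 1024, depth=8):
    """Read in the calling thread; hash and write in separate worker threads."""
    digest = hashlib.sha512()
    errors = []

    def consume(q, handle_block):
        failed = False
        while True:
            block = q.get()
            if block is None:          # sentinel: no more data
                return
            if failed:
                continue               # keep draining so the reader never stalls
            try:
                handle_block(block)
            except Exception as exc:   # remember the error, report it later
                errors.append(exc)
                failed = True

    with contextlib.ExitStack() as stack:
        src = stack.enter_context(open(source, "rb"))
        # One consumer for the hash, plus one per target file.
        handlers = [digest.update]
        for path in targets:
            handlers.append(stack.enter_context(open(path, "wb")).write)
        # Bounded queues act as the buffer between reader and consumers.
        queues = [queue.Queue(maxsize=depth) for _ in handlers]
        workers = [
            threading.Thread(target=consume, args=(q, handler))
            for q, handler in zip(queues, handlers)
        ]
        for worker in workers:
            worker.start()
        try:
            while True:
                block = src.read(block_size)
                if not block:
                    break
                for q in queues:
                    q.put(block)       # blocks only when that consumer is behind
        finally:
            for q in queues:
                q.put(None)
            for worker in workers:
                worker.join()          # files stay open until all writes finish
    if errors:
        raise errors[0]
    return digest.hexdigest()
```

The bounded queues cap memory use at roughly `depth` blocks per consumer, so a slow target eventually throttles the reader instead of letting the buffer grow without limit.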

mjaganathan commented 8 years ago

@JoshuaKolden, the current code, as you understand, takes advantage of RAM. It does not explicitly handle memory management the way you intend. That can only be achieved by reworking the functionality of shutil.copytree to fit the need. We will incorporate the changes as needed.

mrjoshuak commented 8 years ago

No. I'm sorry but that is incorrect.

I'm afraid this isn't working out. I'm going to go ahead and end work here. Thank you for your efforts, but I just don't have the time to pursue this further.

Please accept my apologies for the lack of time and poor communication.

mjaganathan commented 8 years ago

@JoshuaKolden, we have found a faster-running approach and are working on it. We will update soon.

mjaganathan commented 8 years ago

@JoshuaKolden, we have submitted revised code implementing both threaded and non-threaded approaches.