UMPsychMethodsCore / MethodsCore

All of the projects that the methods core develops, combined into one repository!

subject specific master data file #320

Closed heffjos closed 7 years ago

heffjos commented 11 years ago

@rcwelsh @mangstad @sripada

Would we want to go one level further to run specific data files or is the subject level the lowest level we want to go?

rcwelsh commented 11 years ago

That certainly would be the proper generalization. I'm for it. But somehow it needs to be super user friendly. Could there be a way to build it so that the MasterDataFile could hold pointers, data, or a mix of both, and likewise the subject-specific files could hold pointers to run-specific files, data, or a mix? Once the MasterDataFile reader can pick up on either data or pointers to other files, it is easy enough to apply the same code to the subject data.

I like it. This would allow for a good amount of flexibility. Flexibility is always a good thing.

dankessler commented 11 years ago

I agree that it makes sense to point to run-level files, since the point is to free the user of having to concatenate their run-specific files in the first place.

I imagine the "mix" scheme could be a bit trickier to implement, but I get the idea. I wonder if it would be simpler to think of it more like a "scanfile" where it just points to where their run-specific data is, which could help if people have unusually nested designs or something.

heffjos commented 11 years ago

How often do you think users would use a "mix" scheme? If it is rarely used, users would need to create a data file that lists only pointers as an extra step whereas they could have just specified a file path in the MasterDataFilePath variable.

rcwelsh commented 11 years ago

@heffjos : just checking in, where do we stand on this?

could you just build it such that if a pointer appeared in any file, it would insert all of that information at that point? a pointer could just be marked by a special character. if we are suggesting "#" for comments, then "@" could indicate a pointer to another file to be read at that point.

to model e.g.

 subject run event …..
 12345 1 1 ….
 12345 1 2 ….
 @/net/data1/MyExperiment/MySubjectFiles/expDescrip_12335.csv
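The line conventions proposed above ("#" for comments, "@" for a pointer to another file) could be sketched in Python roughly as follows. This is only an illustration: `classify_line` is a hypothetical helper, whitespace-delimited fields are an assumption, and none of it is taken from the MethodsCore code.

```python
def classify_line(line):
    """Classify one line of a data file as blank, comment, pointer, or data."""
    text = line.strip()
    if not text:
        return ("blank", None)
    if text.startswith("#"):
        # Assumed comment marker, as suggested in the thread.
        return ("comment", text[1:].strip())
    if text.startswith("@"):
        # Assumed pointer marker: the rest of the line is a file path.
        return ("pointer", text[1:].strip())
    # Anything else is treated as a whitespace-delimited data row (assumption).
    return ("data", text.split())


print(classify_line("12345 1 1"))
print(classify_line("@/net/data1/MyExperiment/MySubjectFiles/expDescrip_12335.csv"))
```

Checking only the first non-blank character keeps the rule simple and avoids misreading an "@" that happens to appear inside a data field.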

mangstad commented 11 years ago

Honestly I don't think there is any substantial number of users who need or want a feature like that. If a user wants to take the time to create a master file with subject information, they would most likely prefer our existing master data file structure so that every subject's data is immediately accessible. If on the other hand they want subject-specific files, they're likely not going to want to take the time to build a central file that points to those subject-specific files. I don't really think a hybrid approach like this adds anything to the usage except complexity.



rcwelsh commented 11 years ago

it's a general solution that is actually easy to build. it's just recursive, nothing complex about that. the core code that reads in a line stays the same; if it detects the "@", it just calls itself on the file being pointed to. adding the "@" capability will be rather straightforward.

sripada commented 11 years ago

I am not sure I quite understand Robert's proposal. But Joe, if you understand what Robert has in mind, do you want to take it on?


heffjos commented 11 years ago

I agree with Mike that many users would not use the feature; however, if the subject and run numbers are included in data files at all levels, it should be easy to do.

Has anyone done any testing yet?

rcwelsh commented 11 years ago

pseudo code for the recursive call:

 function masterDataTable = masterDataTableRead(DataFileToRead)
 masterDataTable = [];   % start empty so rows can be appended
 open file
 while reading line from file
    read a line into variable newLine
    if newLine contains "@"
       newDataFileToRead = the file pointed to by @
       masterDataTable = [masterDataTable; masterDataTableRead(newDataFileToRead)];
    else
       masterDataTable = [masterDataTable; parseTheLine(newLine)];
    end
 end
 close file

Anyway, it really does not add much to the code, and it allows for easy expandability. I'd like to see it done.
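For readers who want something runnable, the recursive idea can be sketched in Python as below. This is a minimal sketch and not the code that was added to MethodsCore: `read_master_data` and `demo` are hypothetical names, and the whitespace-delimited rows, the "#" comment handling, the check for circular pointers, and the assumption that a pointer is an absolute path (or one relative to the working directory) are all choices made here for illustration.

```python
import os
import tempfile


def read_master_data(path, _active=None):
    """Return data rows from `path`, expanding "@" pointer lines recursively."""
    active = set() if _active is None else _active
    real = os.path.realpath(path)
    if real in active:
        # Guard against a file that (directly or indirectly) points to itself.
        raise ValueError(f"circular pointer to {path}")
    active.add(real)
    rows = []
    with open(path) as handle:
        for line in handle:
            text = line.strip()
            if not text or text.startswith("#"):
                continue  # skip blank lines and comments
            if text.startswith("@"):
                # Pointer line: splice in the rows of the referenced file.
                rows.extend(read_master_data(text[1:].strip(), active))
            else:
                rows.append(text.split())
    active.discard(real)
    return rows


def demo():
    """Build a master file that mixes data rows with one pointer, then read it."""
    with tempfile.TemporaryDirectory() as tmp:
        subject_file = os.path.join(tmp, "subject.txt")
        master_file = os.path.join(tmp, "master.txt")
        with open(subject_file, "w") as f:
            f.write("12345 2 1\n12345 2 2\n")
        with open(master_file, "w") as f:
            f.write("# subject run event\n12345 1 1\n@" + subject_file + "\n")
        return read_master_data(master_file)


print(demo())
```

Rows from a pointed-to file land exactly where the pointer line appeared, so a file made only of pointers and a file made only of data go through the same code path.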

heffjos commented 11 years ago

I have added the code to handle mixed data files.

Also, the FirstLevel_alpha and public branches have diverged, which I think happened after the ConnTool merge.