NationalGenomicsInfrastructure / ngi_pipeline

Code driving the production pipeline at SciLifeLab

Uppsala projects can be added to Charon by parsing the filesystem #81

Closed mariogiov closed 9 years ago

mariogiov commented 10 years ago

I was adding some Uppsala projects to Charon today so I could try to process them, and I was doing this just using the information from the filesystem (e.g. the names of directories and so on). It occurs to me that this may be a much easier way of doing things for Uppsala projects than to have them send us files to upload -- also it seems there would be less risk of introducing human error.

Does anyone have a reason why this isn't a good idea? The scripts are basically done, I would just need to fix them up a bit so they're less hacky.

johandahlberg commented 10 years ago

I support this 100%. When @mariogiov and I started talking about it after the meeting today, we couldn't really see a reason not to. And once we have a LIMS with a REST API in place, we can pull the info from there instead.

pekrau commented 10 years ago

OK, sounds good.

Question 1: Is it not required to add projects to Charon before the data appears in the filesystem?

Question 2: Are project IDs (and sample IDs, etc.) always the same on the filesystem as in the Uppsala LIMS?

/Per K



johandahlberg commented 10 years ago

I can answer Question 2: They should be, as the names are pulled from the sample sheets, which should correspond to the LIMS.

mariogiov commented 10 years ago

Regarding question 1, I don't think this is really required. The only thing I actually use Charon for in the very early steps (when parsing the flowcell before rearranging it into project/sample/libprep/seqrun) is to determine the library prep ID. In fact, we could eliminate this step entirely if the Sthlm SampleSheet.csv files also had the library prep information in them somewhere; however, this isn't required, and I hesitate to ask the lab folks to change things unless we really need them to (why use up favors for something unnecessary).
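As a rough illustration of the filesystem-driven approach discussed here, the sketch below walks a project directory that is assumed to already be arranged as project/sample/libprep/seqrun and collects the names at each level. The function name `parse_project_tree` and the shape of the returned dictionary are hypothetical; this is not the ngi_pipeline code, and it does not talk to Charon.

```python
import tempfile
from pathlib import Path


def parse_project_tree(project_dir):
    """Collect sample, libprep and seqrun names from directory names.

    Assumes the layout <project>/<sample>/<libprep>/<seqrun>/ (an assumption
    taken from the discussion above). Plain files at any level are ignored.
    """
    project_dir = Path(project_dir)
    samples = {}
    for sample_dir in sorted(p for p in project_dir.iterdir() if p.is_dir()):
        libpreps = {}
        for libprep_dir in sorted(p for p in sample_dir.iterdir() if p.is_dir()):
            # Each subdirectory of a libprep is treated as one sequencing run.
            seqruns = sorted(p.name for p in libprep_dir.iterdir() if p.is_dir())
            libpreps[libprep_dir.name] = seqruns
        samples[sample_dir.name] = libpreps
    return {"project": project_dir.name, "samples": samples}


if __name__ == "__main__":
    # Build a tiny throwaway tree with made-up names to show the output shape.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "P123" / "S1" / "A" / "run1").mkdir(parents=True)
        print(parse_project_tree(Path(tmp) / "P123"))
```

The resulting dictionary could then be turned into whatever records the database expects; that step is left out because it depends on the Charon API.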

mariogiov commented 10 years ago

@vezzi might think of something I don't though

vezzi commented 10 years ago

I think your solution is neat and it is the safest way to go. How do you get the lib prep info from Uppsala: do you assume it to be "A", or is this info available?

For now we need a working version for Uppsala; we will have a cleaner and more stable solution once Uppsala has the LIMS in production (a couple of months).

I would not completely forget about the approach of updating the LIMS using a script from the Uppsala side, as this would give us the opportunity to have some information before the data is produced. For example, if a library is sequenced on two flowcells, Stockholm can see this info as soon as it is in the LIMS, whereas Uppsala can see it only after the flowcell has been delivered. This is the only problem I can see.

I propose to implement @mariogiov's version in order to have a working pipeline ASAP, and to see whether, with reasonable effort, @pekrau and @Galithil can produce a tool for Uppsala to do the update asynchronously.

johandahlberg commented 10 years ago

Library prep info is in the samplesheet.

mariogiov commented 10 years ago

Agreed, there could be times when the extra meta-info about a project could be handy to have. I'll fix this up for now and we can await a better solution from Per and Denis / the Uppsala folks. Probably a lot easier(?) once they get their Genologics LIMS.