Closed rolivella closed 4 years ago
ID | Filename | Checksum | Annotation |
---|---|---|---|
1 | 180130_Q_QC2F_01_01_100ng | e8a53e5831b4991d90eebb5e8c239fb0 | None |
2 | 170815_Q_QC2F_01_06_100ng | 654c5180ce42dab7b9ee2ade6351c2da | None |
3 | 170712_Q_QC2F_01_05_100ng | 6b7c25657eed1a640dd61ff6de67b049 | None |
4 | 170307_Q_QC2F_01_01_100ng | 0c966e2777705e7972f06be05586435b | Column changed & Pre-column changed |
5 | 170118_Q_QC2F_01_01_100ng | 2d80fd7bb43f3b7c11abe5bee3aa6206 | None |
6 | 180814_Q_QC2V_01_01_100ng | 1e39d5856e2a837e3ac8c71c879aa3c8 | None |
7 | 180529_Q_QC2V_01_01_100ng | c143f81073a7250f640c756211b4767d | None |
8 | 180306_Q_QC2V_01_02_100ng | 55525df97ae6a4d80a8f316985b92642 | None |
9 | 170526_Q_QC2V_01_01_100ng | c07544d808880c48fcad9a8f915f44bb | MS calibration |
10 | 170307_Q_QC2V_01_02_100ng | e7d0cbff7baff24b1928fbfe9d1afa8a | Column changed & Pre-column changed |
11 | 170109_Q_QC2V_01_01_100ng | b12f8a8457763d222c51561e8d2b2bf9 | LC maintenance & MS maintenance |
12 | 180409_Q_QC2F_01_01_100ng | 6e3a7e470f328fa35d1738e043d52d3d | None |
13 | 180725_Q_QC2F_01_02_100ng | a668f29b5228fef3cdef216bf16dea9a | None |
/users/pr/qcloud/test/elixir_proteomics_QC_current/output
Also remove "oxibutanol" from variable modifications.
Test with RAW files from other instruments.
In summary, modifications and checks to do:
Script to generate px file:
BulkPRIDESubmission --folder /users/pr/qcloud/test/pride_submission/files /users/pr/qcloud/test/pride_submission/output
After the script has been modified by Mathias, the way to run it is like this:
BulkPRIDESubmission --folder /users/pr/qcloud/test/pride_submission/files
Mathias created an FTP account for us:
I tried, but I cannot connect to it:
Status: Resolving address of ftp-private.ebi.ac.uk
Status: Connecting to 193.62.194.179:21...
Status: Connection established, waiting for welcome message...
Status: Initializing TLS...
Status: Verifying certificate...
Status: TLS connection established.
Command: USER roger.olivella@crg.eu
Response: 530 Permission denied.
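The log above shows an explicit-TLS FTP session on port 21 that is rejected right after the USER command. A minimal sketch for checking credentials outside FileZilla, using Python's standard `ftplib`. The helper `check_ftp_login` and the stand-in class `RejectingFTP` are hypothetical names introduced here for illustration; the stand-in only mimics the 530 reply so the example runs without network access.

```python
import ftplib


def check_ftp_login(host, user, password, ftp_factory=ftplib.FTP_TLS, timeout=30):
    """Try an explicit-TLS FTP login and return (ok, server_message)."""
    ftp = ftp_factory()
    try:
        ftp.connect(host, 21, timeout)
        # FTP_TLS.login() secures the control connection before sending USER.
        ftp.login(user, password)
        ftp.prot_p()  # also protect the data connection
        ftp.quit()
        return True, "login accepted"
    except ftplib.error_perm as exc:
        # Permanent 5xx replies such as "530 Permission denied." end up here.
        return False, str(exc)
    finally:
        ftp.close()


class RejectingFTP:
    """Stand-in server (hypothetical) that rejects every login like the log above."""

    def connect(self, host, port, timeout):
        pass

    def login(self, user, password):
        raise ftplib.error_perm("530 Permission denied.")

    def prot_p(self):
        pass

    def quit(self):
        pass

    def close(self):
        pass


result = check_ftp_login("ftp-private.ebi.ac.uk", "user@example.org", "not-a-real-password",
                         ftp_factory=RejectingFTP)
print(result)
```

Against the real server, the same function would be called without `ftp_factory` and with the password read interactively (for example via `getpass.getpass()`), so that it never lands in a script or shell history.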
Current configuration:
Mail to Mathias:
Pending:
Test EBI FTP account
How to update Mathias script: https://github.com/proteomicsunitcrg/qcloud2-pipeline/issues/45#issuecomment-600502115
According to Mathias:
1) How to access:
1) Still permission denied. Can you login to the FTP with my credentials? Ah, I think the 'issue' was that there are two different FTPs for EBI's PRIDE, IDK why. Your credentials are for the other server, I think. Please do the following: update the tool, container/pip either is fine, and use the pw I provided. Don't worry if login with filezilla is not working. Try with the tool.
2) New password:
ivZ4PK9k with your user handle and folder CRG_bulk_PX
3) Script use:
No need to implement anything, you just need to tell the tool which folders to use. Here is an example: BulkPRIDESubmission --folder /files/folder/instrumentX/2018/ --folder /files/folder/instrumentY/2018/ --folder /files/folder/instrumentX/2019/
Conclusion:
1) The FTP user and password are working:
2) Could the starting info be stored in an md file?
3) I don't know exactly what to put in the --folder param. If I put the output folder I get:
Collecting metadata from files within the given folders.
WARNING:root:Ignoring these directories: files/QC02_6b7c25657eed1a640dd61ff6de67b049,files/QC02_b12f8a8457763d222c51561e8d2b2bf9
Enter provided ftp password: ********
Enter provided ftp folder name: CRG_bulk_PX
Uploading 0 files...
0.0% [=====================================================================================================================>] 0/ ? eta [?:??:??]
Done. Thank you for choosing BulkPRIDESubmission. Have a great day!
ERROR:asyncio:Task was destroyed but it is pending!
task: <Task pending coro=<Renderer.wait_for_cpr_responses.<locals>.wait_for_timeout() done, defined at /users/pr/qcloud/.local/lib/python3.6/site-packages/prompt_toolkit/renderer.py:504> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7f3ab24a87c8>()]>>
If I put the folder where the files are I get:
Collecting metadata from files within the given folders.
Traceback (most recent call last):
File "/users/pr/qcloud/.local/bin/BulkPRIDESubmission", line 11, in <module>
load_entry_point('BulkPrideSubmission==0.0.1', 'console_scripts', 'BulkPRIDESubmission')()
File "/users/pr/qcloud/.local/lib/python3.6/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/users/pr/qcloud/.local/lib/python3.6/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/users/pr/qcloud/.local/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/users/pr/qcloud/.local/lib/python3.6/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/users/pr/qcloud/.local/lib/python3.6/site-packages/click/decorators.py", line 21, in new_func
return f(get_current_context(), *args, **kwargs)
File "/users/pr/qcloud/.local/lib/python3.6/site-packages/cli/bulk_pride_submission.py", line 318, in start
logging.warning("Ignoring these unmatched files: {}".format(','.join(unassociated)))
TypeError: sequence item 2: expected str instance, set found
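The traceback ends in `','.join(unassociated)`, and `str.join` raises exactly this error when one element is not a string. A minimal reproduction follows, assuming `unassociated` is a list that mixes plain paths with a set of paths at index 2 (the real structure inside the tool is not shown in this thread). `flatten_names` is a hypothetical helper, not part of BulkPRIDESubmission.

```python
def flatten_names(items):
    """Flatten a mix of strings and collections of strings into a sorted list of strings."""
    flat = []
    for item in items:
        if isinstance(item, (set, frozenset, list, tuple)):
            flat.extend(str(x) for x in item)
        else:
            flat.append(str(item))
    return sorted(flat)


# Assumed shape: two plain file names followed by a set of file names.
unassociated = ["a.mzid", "a.mzQC", {"b.raw", "b.mzML"}]

try:
    ",".join(unassociated)
    join_error = None
except TypeError as exc:
    join_error = str(exc)

message = "Ignoring these unmatched files: {}".format(",".join(flatten_names(unassociated)))
print(join_error)
print(message)
```

Flattening before joining is only one possible fix; the warning could equally be built where the names are first collected, so that no set ever reaches the `join`.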
According to Mathias:
The tool is designed to look into each given (`--folder`) folder for files to submit. It will not look into sub-folders. When there are name-matching mzid and raw files, the tool will register all other name-matching files of other types (such as mzQC), too.
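To illustrate why pointing `--folder` at the parent directory uploads 0 files, here is a small sketch of a non-recursive folder scan. It is not the tool's code: `list_direct_files` and `list_subfolders` are hypothetical helpers, and the directory layout is a throwaway imitation of `files/QC02_<checksum>/`.

```python
import os
import tempfile


def list_direct_files(folder):
    """Names of regular files directly inside `folder`; sub-folders are not entered."""
    with os.scandir(folder) as entries:
        return sorted(e.name for e in entries if e.is_file())


def list_subfolders(folder):
    """Names of the sub-folders that a non-recursive scan would skip."""
    with os.scandir(folder) as entries:
        return sorted(e.name for e in entries if e.is_dir())


# Build a temporary layout: <root>/QC02_example/{QC02_example.mzid, QC02_example.raw}
with tempfile.TemporaryDirectory() as root:
    sub = os.path.join(root, "QC02_example")
    os.mkdir(sub)
    for name in ("QC02_example.mzid", "QC02_example.raw"):
        open(os.path.join(sub, name), "w").close()

    parent_files = list_direct_files(root)  # scanning the parent finds no files
    parent_dirs = list_subfolders(root)     # only a directory that gets ignored
    sub_files = list_direct_files(sub)      # scanning the sub-folder finds the files

print(parent_files)
print(parent_dirs)
print(sub_files)
```

Under this reading, each per-sample directory has to be passed as its own `--folder` argument, which matches the multi-`--folder` example Mathias gave.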
To update Mathias script:
Code at: https://gitlab.ebi.ac.uk/walzer/bulk-pride-submission
git pull
pip3 install .
Successfully installed BulkPrideSubmission-0.0.1
Example of metadata:
select bar.creationdate,annotation_code.annotation
from bar inner join annotation_code
on bar.annotation=annotation_code.code
where creationdate between "2017-01-01" and "2017-12-31"
and instrument="f"
and type="hela";
Result:
+---------------------+--------------------------------+
| creationdate        | annotation                     |
+---------------------+--------------------------------+
| 2017-03-08 20:15:00 | LC and/or MS Troubleshooting   |
| 2017-03-08 00:02:00 | Column and/or precolumn change |
| 2017-04-03 02:06:00 | LC and/or MS service           |
| 2017-05-29 05:43:00 | Calibration                    |
| 2017-06-30 02:10:00 | Calibration                    |
| 2017-08-31 13:01:00 | New QC aliquote                |
| 2017-11-24 00:08:00 | LC and/or MS service           |
| 2017-11-24 00:08:00 | Calibration                    |
| 2017-11-24 00:08:00 | Cleaning                       |
| 2017-11-28 03:34:00 | LC and/or MS Troubleshooting   |
| 2017-11-29 04:02:00 | LC and/or MS service           |
| 2017-12-13 23:15:00 | Column and/or precolumn change |
| 2017-12-13 23:15:00 | LC and/or MS Troubleshooting   |
+---------------------+--------------------------------+
Updated script: Successfully installed BulkPrideSubmission-0.0.1. Could Mathias change the version?
Now it seems to work except for:
qcloud@nextflow:/users/pr/qcloud/test/pride_submission$ BulkPRIDESubmission --prepared input_data.md --folder files/QC02_b12f8a8457763d222c51561e8d2b2bf9/
No prepared submission settings readable, will overwrite!
You will need your sample processing protocol available in a file named `sample_processing_protocol.md`!
You will need your data processing protocol available in a file named `data_processing_protocol.md`!
You will be prompted for a number of informations about the submission.
Ready? [y/N]: y
Here we go!
Please enter your name: Roger Olivella
Please enter your email: roger.olivella@crg.eu
Please enter your affiliation (i.e. institution): CRG
Please enter your username for pride login: roger.olivella@crg.eu
Please enter your lab head's name: Eduard Sabidó
Please enter your lab head's email: eduard.sabido@crg.eu
Please enter your lab head's affiliation (i.e. institution): CRG
Please enter your a project title: test 4
Please enter your a concise project description: test 4
Ok? [y/N]: y
Keywords (comma sparated, finalised by enter): test, qcloud
Your keywords test,qcloud
Ok? [y/N]: y
Are all your samples of one organism and one tissue type? [y/N]: y
Enter species type: Homo sapiens (Human)
Enter tissue type: HeLa cell
Enter experiment type: Proteogenomics
Collecting metadata from files within the given folders.
['files/QC02_b12f8a8457763d222c51561e8d2b2bf9/QC02_b12f8a8457763d222c51561e8d2b2bf9.ok.mzML', 'files/QC02_b12f8a8457763d222c51561e8d2b2bf9/QC02_b12f8a8457763d222c51561e8d2b2bf9.mzid', 'files/QC02_b12f8a8457763d222c51561e8d2b2bf9/QC02_b12f8a8457763d222c51561e8d2b2bf9.mzQC', 'files/QC02_b12f8a8457763d222c51561e8d2b2bf9/QC02_b12f8a8457763d222c51561e8d2b2bf9.featureXML', 'files/QC02_b12f8a8457763d222c51561e8d2b2bf9/QC02_b12f8a8457763d222c51561e8d2b2bf9.qcml']
WARNING:root:Ignoring these unmatched files: files/QC02_b12f8a8457763d222c51561e8d2b2bf9/QC02_b12f8a8457763d222c51561e8d2b2bf9.ok.mzML,files/QC02_b12f8a8457763d222c51561e8d2b2bf9/QC02_b12f8a8457763d222c51561e8d2b2bf9.mzid,files/QC02_b12f8a8457763d222c51561e8d2b2bf9/QC02_b12f8a8457763d222c51561e8d2b2bf9.mzQC,files/QC02_b12f8a8457763d222c51561e8d2b2bf9/QC02_b12f8a8457763d222c51561e8d2b2bf9.featureXML,files/QC02_b12f8a8457763d222c51561e8d2b2bf9/QC02_b12f8a8457763d222c51561e8d2b2bf9.qcml
Enter provided ftp password: ********
Enter provided ftp folder name: CRG_bulk_PX
Uploading 0 files...
0.0% [=====================================================================================================================>] 0/ ? eta [?:??:??]
Done. Thank you for choosing BulkPRIDESubmission. Have a great day!
**ERROR:asyncio:Task was destroyed but it is pending!
task: <Task pending coro=<Renderer.wait_for_cpr_responses.<locals>.wait_for_timeout() done, defined at /users/pr/qcloud/.local/lib/python3.6/site-packages/prompt_toolkit/renderer.py:504> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7f00005ab5e8>()]>>**
I'm trying to understand the issue with the ".ok.". As I mentioned before, the warning I get is this:
WARNING:root:Ignoring these unmatched files: files/QC02_b12f8a8457763d222c51561e8d2b2bf9/QC02_b12f8a8457763d222c51561e8d2b2bf9.ok.mzML,files/QC02_b12f8a8457763d222c51561e8d2b2bf9/QC02_b12f8a8457763d222c51561e8d2b2bf9.mzid,files/QC02_b12f8a8457763d222c51561e8d2b2bf9/QC02_b12f8a8457763d222c51561e8d2b2bf9.featureXML,files/QC02_b12f8a8457763d222c51561e8d2b2bf9/QC02_b12f8a8457763d222c51561e8d2b2bf9.qcml,files/QC02_b12f8a8457763d222c51561e8d2b2bf9/QC02_b12f8a8457763d222c51561e8d2b2bf9.mzQC
But in the folder:
/QC02_b12f8a8457763d222c51561e8d2b2bf9 I already have the file
QC02_b12f8a8457763d222c51561e8d2b2bf9.ok.mzML
which is the one referenced in the mzID, so where's the problem?
According to Mathias: "Let me explain how I went about the matching. I require the tool to find file base name matching triplets from raw, mzid, and mzml so that I have a direct line of progenitor files. I think, having that is essential."
However, I'm currently not including the raw file in the output, so this could be the reason why I can't upload to the FTP?
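A rough sketch of base-name triplet matching as Mathias describes it, applied to the file names from the warning above. This is an illustration under assumptions, not the tool's implementation: `group_by_base`, `complete_triplets`, and the `strip_ok` switch are hypothetical, and whether the real tool tolerates the `.ok` marker in the mzML name is not established in this thread.

```python
import os
from collections import defaultdict

# Assumed required file types for one submission unit (raw, mzid, mzML).
REQUIRED = {".raw", ".mzid", ".mzml"}


def group_by_base(filenames, strip_ok=False):
    """Map each base name to the set of lower-cased extensions found for it."""
    groups = defaultdict(set)
    for name in filenames:
        base, ext = os.path.splitext(os.path.basename(name))
        if strip_ok and base.endswith(".ok"):
            base = base[: -len(".ok")]  # hypothetical handling of the ".ok" marker
        groups[base].add(ext.lower())
    return dict(groups)


def complete_triplets(filenames, strip_ok=False):
    """Base names for which raw, mzid, and mzML files are all present."""
    groups = group_by_base(filenames, strip_ok)
    return sorted(base for base, exts in groups.items() if REQUIRED <= exts)


stem = "QC02_b12f8a8457763d222c51561e8d2b2bf9"
current = [stem + ".ok.mzML", stem + ".mzid", stem + ".mzQC",
           stem + ".featureXML", stem + ".qcml"]
with_raw = current + [stem + ".raw"]

no_raw = complete_triplets(current, strip_ok=True)           # raw file missing
ok_not_stripped = complete_triplets(with_raw, strip_ok=False)  # ".ok" changes the base name
ok_stripped = complete_triplets(with_raw, strip_ok=True)       # all three types line up
print(no_raw, ok_not_stripped, ok_stripped)
```

In this sketch, the current output folder yields no triplet for two independent reasons: there is no raw file, and plain base-name comparison treats `<stem>.ok` and `<stem>` as different names.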
Toy dataset completely tested.