ebi-ait / hca-to-scea-tools

Helpers and tools to assist in the conversion of HCA datasets into SCEA
0 stars 0 forks source link

Update to hca2scea tool: automatically obtain fastq file paths or SRA file paths #54

Closed ami-day closed 3 years ago

ami-day commented 3 years ago

This is a pull request for an update to the hca-to-scea-tools repo. In particular, the main "script.py" script, and also the addition of a "fetch_fastq_path.py" module. The update enables the script to automatically get the fastq file paths or SRA file paths based on the list of run accessions and also the relevant file names. This information is automatically added into the output sdrf file.

ami-day commented 3 years ago

Thanks, @ami-day, for the PR.

A few comments:

  1. How are the new features tested? I did not notice any change to the test files.
  2. Keeping the code as it is, albeit maybe functionally correct, would be incurring a big technical debt. It would be very difficult for us as a team, and even for you who wrote this, to maintain, modify, fix and extend it with confidence.
  3. One of the test scenarios is currently failing.

Thanks for the quick review @amnonkhen, I agree, I'll next work on adding feature tests for this and also spend some time refactoring the code so it is maintainable. I'll work on this this sprint as I'm on development. Will have a look at why 1 of the tests is failing.

ami-day commented 3 years ago

Hi @amnonkhen I am finished with my refactoring and added some checks for expected output at the end. It seems to me the tests are working with updated expected output files. I'd like to get this pull request merged soon so I can move onto updating it with new features. Shall we go through how you would approach improving the code first? What's best for you?

p.s. it is saying below that all checks have failed. But, when I run the test script with the characteristic test, it is working fine. Im not sure why this is?