EOSC-LOFAR / prefactor-cwl

CWL version of prefactor
MIT License

make run results in RuntimeError: TableProxy::getColumn: column AZEL1 does not exist #23

Open HannoSpreeuw opened 5 years ago

HannoSpreeuw commented 5 years ago

make run after git pull (master branch), make small and systemctl start docker.service gives:

........
........
import os
import sys
inputfiles = sys.argv[1:]
print(inputfiles)
Isub_deep.main(inputfiles, outmapname="cwl", mapfile_dir=os.path.basename("."))
' \
/var/lib/cwl/stgcda8f639-672c-4c0d-819c-e78f8967b048/calibrated.MS \
/var/lib/cwl/stg83de6ff0-0abb-49b9-820f-12d9cf6949ce/calibrated.MS \
/var/lib/cwl/stgcb6f29f3-dc46-405d-8565-4fdf85db8e20/calibrated.MS \
/var/lib/cwl/stgce010a2f-bc0c-4264-b22e-f87b39efa57b/calibrated.MS \
/var/lib/cwl/stg06cb6b30-e688-464c-9711-4a1470e32e93/calibrated.MS \
/var/lib/cwl/stgaf01f57d-e46f-4546-a699-588e5f1e68ee/calibrated.MS \
/var/lib/cwl/stgfd9c1107-3163-45a1-8c17-88288bb77390/calibrated.MS \
/var/lib/cwl/stg6db08ff4-527a-488b-84f2-a67f7d1c6cb4/calibrated.MS \
/var/lib/cwl/stg07badb29-eda4-44d9-8ffa-ba8da4e4315b/calibrated.MS \
/var/lib/cwl/stg7b0e0b16-a804-449a-87a8-a7a8c8ff29dd/calibrated.MS \
/var/lib/cwl/stgd209a638-9f91-406b-9023-b0d7b653f3ef/calibrated.MS \
/var/lib/cwl/stg0a2386fd-6e6c-4147-b748-891c288958a3/calibrated.MS \
/var/lib/cwl/stg005912ea-0fb2-4ed1-a708-81b88a3d2122/calibrated.MS \
/var/lib/cwl/stg9a028d09-5f53-4c1e-8c43-a31b6d36e082/calibrated.MS \
/var/lib/cwl/stg80a067b9-b27b-4264-b7ac-996fdceb601a/calibrated.MS \
/var/lib/cwl/stg89d5788c-5c69-4a8c-9fb2-611e596daff5/calibrated.MS \
/var/lib/cwl/stg05773b1e-12f4-4f57-b10d-9867ddb331ae/calibrated.MS \
/var/lib/cwl/stg567ea26d-72c8-4939-b68c-f5f8982afdd9/calibrated.MS \
/var/lib/cwl/stg3b3e0775-94ba-42a9-bb50-a6eb1a9ffa2a/calibrated.MS \
/var/lib/cwl/stgb89c056a-b8b5-4981-9728-3804d6378469/calibrated.MS
/usr/lib/python2.7/dist-packages/lofarpipe/support/utilities.pyc : Using default subprocess module!
Traceback (most recent call last):
  File "<string>", line 8, in <module>
  File "/usr/lib/prefactor/scripts/InitSubtract_deep_sort_and_compute.py", line 322, in main
    fieldsize_highres, fieldsize_lowres)
  File "/usr/lib/prefactor/scripts/InitSubtract_deep_sort_and_compute.py", line 71, in get_image_sizes
['/var/lib/cwl/stgcda8f639-672c-4c0d-819c-e78f8967b048/calibrated.MS', '/var/lib/cwl/stg83de6ff0-0abb-49b9-820f-12d9cf6949ce/calibrated.MS', '/var/lib/cwl/stgcb6f29f3-dc46-405d-8565-4fdf85db8e20/calibrated.MS', '/var/lib/cwl/stgce010a2f-bc0c-4264-b22e-f87b39efa57b/calibrated.MS', '/var/lib/cwl/stg06cb6b30-e688-464c-9711-4a1470e32e93/calibrated.MS', '/var/lib/cwl/stgaf01f57d-e46f-4546-a699-588e5f1e68ee/calibrated.MS', '/var/lib/cwl/stgfd9c1107-3163-45a1-8c17-88288bb77390/calibrated.MS', '/var/lib/cwl/stg6db08ff4-527a-488b-84f2-a67f7d1c6cb4/calibrated.MS', '/var/lib/cwl/stg07badb29-eda4-44d9-8ffa-ba8da4e4315b/calibrated.MS', '/var/lib/cwl/stg7b0e0b16-a804-449a-87a8-a7a8c8ff29dd/calibrated.MS', '/var/lib/cwl/stgd209a638-9f91-406b-9023-b0d7b653f3ef/calibrated.MS', '/var/lib/cwl/stg0a2386fd-6e6c-4147-b748-891c288958a3/calibrated.MS', '/var/lib/cwl/stg005912ea-0fb2-4ed1-a708-81b88a3d2122/calibrated.MS', '/var/lib/cwl/stg9a028d09-5f53-4c1e-8c43-a31b6d36e082/calibrated.MS', '/var/lib/cwl/stg80a067b9-b27b-4264-b7ac-996fdceb601a/calibrated.MS', '/var/lib/cwl/stg89d5788c-5c69-4a8c-9fb2-611e596daff5/calibrated.MS', '/var/lib/cwl/stg05773b1e-12f4-4f57-b10d-9867ddb331ae/calibrated.MS', '/var/lib/cwl/stg567ea26d-72c8-4939-b68c-f5f8982afdd9/calibrated.MS', '/var/lib/cwl/stg3b3e0775-94ba-42a9-bb50-a6eb1a9ffa2a/calibrated.MS', '/var/lib/cwl/stgb89c056a-b8b5-4981-9728-3804d6378469/calibrated.MS']
InitSubtract_deep_sort_and_compute.py: Putting files into bands.
InitSubtract_deep_sort_and_compute.py: Working on Band: 120
    global_el_values = tab.getcol('AZEL1', rowincr=10000)[:, 1]
  File "/usr/lib/python2.7/dist-packages/casacore/tables/table.py", line 1006, in getcol
    return self._getcol(columnname, startrow, nrow, rowincr)
RuntimeError: TableProxy::getColumn: column AZEL1 does not exist
[job do_magic] Job error: Error collecting output for parameter 'mapfile_paths': steps/do_magic.cwl:20:7: Did not find output file with glob pattern: '['cwl']'
[job do_magic] completed permanentFail
[step do_magic] Output is missing expected field file:///some_path/prefactor-cwl/prefactor.cwl#do_magic/mapfile_paths
[step do_magic] Output is missing expected field file:///some_path/prefactor-cwl/prefactor.cwl#do_magic/mapfile_deep_high_padded
[step do_magic] Output is missing expected field file:///some_path/prefactor-cwl/prefactor.cwl#do_magic/mapfile_deep_high_size
[step do_magic] Output is missing expected field file:///some_path/prefactor-cwl/prefactor.cwl#do_magic/mapfile_deep_low_padded_size
[step do_magic] Output is missing expected field file:///some_path/prefactor-cwl/prefactor.cwl#do_magic/mapfile_deep_low_size
[step do_magic] Output is missing expected field file:///some_path/prefactor-cwl/prefactor.cwl#do_magic/mapfile_freqstep
[step do_magic] Output is missing expected field file:///some_path/prefactor-cwl/prefactor.cwl#do_magic/mapfile_high_padded_size
[step do_magic] Output is missing expected field file:///some_path/prefactor-cwl/prefactor.cwl#do_magic/mapfile_high_size
[step do_magic] Output is missing expected field file:///some_path/prefactor-cwl/prefactor.cwl#do_magic/mapfile_low_padded_size
[step do_magic] Output is missing expected field file:///some_path/prefactor-cwl/prefactor.cwl#do_magic/mapfile_low_size
[step do_magic] Output is missing expected field file:///some_path/prefactor-cwl/prefactor.cwl#do_magic/mapfile_nbands
[step do_magic] Output is missing expected field file:///some_path/prefactor-cwl/prefactor.cwl#do_magic/mapfile_nchansout_clean1
[step do_magic] Output is missing expected field file:///some_path/prefactor-cwl/prefactor.cwl#do_magic/mapfile_nwavelengths_high
[step do_magic] Output is missing expected field file:///some_path/prefactor-cwl/prefactor.cwl#do_magic/mapfile_nwavelengths_low
[step do_magic] Output is missing expected field file:///some_path/prefactor-cwl/prefactor.cwl#do_magic/mapfile_single
[step do_magic] Output is missing expected field file:///some_path/prefactor-cwl/prefactor.cwl#do_magic/mapfile_timestep
[step do_magic] completed permanentFail
[workflow prefactor.cwl] completed permanentFail
{
    "psf": null,
    "losoto_h5": {
        "format": "http://revoltek.github.io/losoto/cookbook.pdf",
        "checksum": "sha1$23357758aede973375e4d7223c39a321407443e3",
        "basename": "losoto.h5",
        "location": "file:///some_path/prefactor-cwl/runs/run_2018-10-01-13-03-15/results/losoto.h5",
        "path": "/some_path/prefactor-cwl/runs/run_2018-10-01-13-03-15/results/losoto.h5",
        "class": "File",
        "size": 2507960
    },
    "dtec_allsols": null,
    "dclock_1st": null,
    "phase_array": null,
    "amplitude_array": null,
    "phase_xx_yy_offset": null,
    "amp_allsols": null,
    "dclock_allsols": null,
    "freqs_for_phase_array": null,
    "dclock_1st_sm": null,
    "residual": null,
    "polYY_dirpointing": null,
    "dirty": null,
    "model": null,
    "station_names": null,
    "dTEC_1st": null,
    "polXX_dirpointing": null,
    "dTEC_1st_sm": null
}
Final process status is permanentFail
make: *** [Makefile:57: run] Error 1

HannoSpreeuw commented 5 years ago

Same error from make run-udocker

gijzelaerr commented 5 years ago

We discussed this already; the error is: TableProxy::getColumn: column AZEL1 does not exist

This is either a bug in prefactor, or that column is required and missing from the MS you are using.
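
One way to tell the two cases apart is to list the columns the Measurement Set actually has before the pipeline runs. The following is only a minimal sketch: it assumes python-casacore is installed and that the `tab` object in the traceback is the main table of the MS. The helper names `missing_columns` and `check_measurement_set` are hypothetical and not part of prefactor.

```python
def missing_columns(available, required=("AZEL1",)):
    """Return the names in `required` that are absent from `available`."""
    present = set(available)
    return [name for name in required if name not in present]


def check_measurement_set(ms_path, required=("AZEL1",)):
    """List required columns missing from the main table of a Measurement Set.

    Hypothetical helper; assumes the failing getcol() call reads the main table.
    """
    # Imported lazily so missing_columns() stays usable without python-casacore.
    from casacore.tables import table

    tab = table(ms_path, ack=False)  # ack=False suppresses the open message
    try:
        return missing_columns(tab.colnames(), required)
    finally:
        tab.close()
```

Calling `check_measurement_set("calibrated.MS")` inside the container would return `["AZEL1"]` if the column is absent, which would point at the data (or at an earlier pipeline step) rather than at the script itself.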

HannoSpreeuw commented 5 years ago

I tried a different MS: L429550/L429550_SAP000_SB000_uv.MS. make run-singularity --debug now gives a different error:

[workflow ] start
[workflow ] starting step ndppp_prep_cal
[step ndppp_prep_cal] start
[job ndppp_prep_cal] Output of job will be cached in .......
Using local copy of Singularity image found in ......
[job ndppp_prep_cal] ..... --quiet \
    exec \
    --contain \
    --pid \
    --ipc \
    --bind \
    ......./prefactor-cwl/cache/b4c29506c6199e9807903d51178cb6d5:/var/spool/cwl:rw \
    --bind \
    /tmp/tmpER5Tlt:/tmp:rw \
    --bind \
    ....../data/L429550/L429550_SAP000_SB000_uv.MS:/var/lib/cwl/stg63c86b06-af69-4494-8416-a59a9049b781/L429550_SAP000_SB000_uv.MS:ro \
    --pwd \
    /var/spool/cwl \
    ....../prefactor-cwl/kernsuite-prefactor.img \
    NDPPP \
    msout=calibrated.MS \
    average.freqresolution=48.82kHz \
    avg.freqstep=2 \
    average.timeresolution=4 \
    avg.timestep=2 \
    avg.type=average \
    baseline=[CS013HBA] \
    'filter.baseline=CS, RS&&' \
    filter.remove=True \
    filter.type=filter \
    'flag.baseline=[ CS013HBA ]' \
    flag.type=filter \
    flagamp.amplmin=1e-30 \
    flagamp.type=preflagger \
    msin=/var/lib/cwl/stg63c86b06-af69-4494-8416-a59a9049b781/L429550_SAP000_SB000_uv.MS \
    msin.datacolumn=DATA \
    msout.overwrite=True \
    msout.writefullresflag=False \
    steps=[flag,filter,avg,flagamp]
log4cplus:ERROR No appenders could be found for logger (LCS.Common).
log4cplus:ERROR Please initialize the log4cplus system properly.

uncaught exception

Backtrace follows:

0 0x7faf83c57684 in ?? at ??:0

1 0x7faf839456b6 in ?? at ??:0

2 0x7faf83945701 in ?? at ??:0

3 0x7faf83945919 in ?? at ??:0

4 0x7faf840f441d in ?? at ??:0

5 0x7faf840f49a8 in ?? at ??:0

6 0x7faf80ff9c6e in ?? at ??:0

7 0x7faf80f2c838 in ?? at ??:0

8 0x7faf80f4bf88 in ?? at ??:0

9 0x7faf80f95ebb in ?? at ??:0

10 0x7faf80f97d74 in ?? at ??:0

11 0x7faf80f981d3 in ?? at ??:0

12 0x7faf81fc36c0 in ?? at ??:0

13 0x7faf81f5c199 in ?? at ??:0

14 0x7faf8451c5fe in ?? at ??:0

15 0x7faf844db709 in ?? at ??:0

16 0x7faf844dd755 in ?? at ??:0

17 0x404e9a in ?? at ??:0

18 0x7faf832f8830 in ?? at ??:0

19 0x4055e9 in ?? at ??:0

terminate called after throwing an instance of 'casa::AipsError'
  what(): Shared library lofarstman not found in CASACORE_LDPATH or (DY)LD_LIBRARY_PATH
libcasa_lofarstman.so.2: cannot open shared object file: No such file or directory
libcasa_lofarstman.so: cannot open shared object file: No such file or directory
liblofarstman.so.2: cannot open shared object file: No such file or directory
liblofarstman.so: cannot open shared object file: No such file or directory

[job ndppp_prep_cal] Job error: Error collecting output for parameter 'msout': steps/ndppp_prep_cal.cwl:164:7: Did not find output file with glob pattern: '[u'calibrated.MS']'
[job ndppp_prep_cal] completed permanentFail
[step ndppp_prep_cal] completed permanentFail
[workflow ] completed permanentFail
{
    "psf": null,
    "losoto_h5": null,
    "dtec_allsols": null,
    "dclock_1st": null,
    "phase_array": null,
    "amplitude_array": null,
    "phase_xx_yy_offset": null,
    "amp_allsols": null,
    "dclock_allsols": null,
    "freqs_for_phase_array": null,
    "dclock_1st_sm": null,
    "residual": null,
    "polYY_dirpointing": null,
    "dirty": null,
    "model": null,
    "station_names": null,
    "dTEC_1st": null,
    "polXX_dirpointing": null,
    "dTEC_1st_sm": null
}
Final process status is permanentFail
make: *** [Makefile:66: run-singularity] Error 1

gijzelaerr commented 5 years ago

Looks like DP3 is dynamically loading liblofarstman, which is not installed in the image that is on docker hub.
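
A quick way to confirm this from inside the image is to look for the file names listed in the error message in the directories named by the two environment variables it mentions. This is a rough sketch, not a definitive check: the dynamic loader also consults system default locations that are not covered here, so a negative result only says the library is not in the searched directories. The function names are hypothetical.

```python
import os

# File names copied from the error message in this thread.
CANDIDATES = (
    "libcasa_lofarstman.so.2",
    "libcasa_lofarstman.so",
    "liblofarstman.so.2",
    "liblofarstman.so",
)


def search_dirs(environ=None, variables=("CASACORE_LDPATH", "LD_LIBRARY_PATH")):
    """Collect the directories listed in the given path-style variables."""
    environ = os.environ if environ is None else environ
    dirs = []
    for var in variables:
        for entry in environ.get(var, "").split(os.pathsep):
            # Skip empty entries and keep only the first occurrence of each.
            if entry and entry not in dirs:
                dirs.append(entry)
    return dirs


def find_library(dirs, candidates=CANDIDATES):
    """Return the first existing candidate path, or None if none is found."""
    for directory in dirs:
        for name in candidates:
            path = os.path.join(directory, name)
            if os.path.isfile(path):
                return path
    return None


if __name__ == "__main__":
    found = find_library(search_dirs())
    print(found or "lofarstman not found in the searched directories")
```

Running this with the image's Python would show whether any of the four names exists under `CASACORE_LDPATH` or `LD_LIBRARY_PATH`.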

ygrange commented 5 years ago

This is a known issue with the previously unpatched version of prefactor (cf. discussion on our Slack channel). So you may want to rebuild the Singularity container based on the version that @gijzelaerr created on the 28th of September.

HannoSpreeuw commented 5 years ago

Is that also master branch? Because I am running the master branch. If yes, have his changes been reverted?

HannoSpreeuw commented 5 years ago

Same error from make run-udocker as expected.

ygrange commented 5 years ago

The master branch of this repo? Yeah. It is in essence the newer version of the apt-get package in KERN that you'd need. So even though Docker tells you the image is recent, you actually need to trigger the apt-get install again if your image is from before the 1st of October (give or take). So an explicit docker pull kernsuite/prefactor should probably do the trick here (I hope).

HannoSpreeuw commented 5 years ago

I don't get it. These pulls are performed automatically after make run-udocker and make run-singularity so I don't see why I should do anything extra since I've done a fresh git clone today.

gijzelaerr commented 5 years ago

I think Singularity pulls a Docker image from Docker Hub; it might be that that image is not updated...

HannoSpreeuw commented 5 years ago

Same error, but perhaps I need to delete the previous image?

docker pull kernsuite/prefactor
Using default tag: latest
latest: Pulling from kernsuite/prefactor
Digest: sha256:ffc19824ea872736606c6dc9eca4bf5a04e840e5fcc2b4f478144ef9d2c6f11a
Status: Image is up to date for kernsuite/prefactor:latest

HannoSpreeuw commented 5 years ago

Removed old docker image:

docker rmi ....
docker pull kernsuite/prefactor
Using default tag: latest
latest: Pulling from kernsuite/prefactor
ae79f2514705: Pull complete
c59d01a7e4ca: Pull complete
41ba73a9054d: Pull complete
f1bbfd495cc1: Pull complete
0c346f7223e2: Pull complete
760bbf825191: Pull complete
d92598cf94b0: Pull complete
44bd5d573b1b: Pull complete
92c8ae71df9e: Pull complete
7c66b3b256d7: Pull complete
a22c18e00204: Pull complete
a3a761fd9e05: Pull complete
Digest: sha256:ffc19824ea872736606c6dc9eca4bf5a04e840e5fcc2b4f478144ef9d2c6f11a
Status: Downloaded newer image for kernsuite/prefactor:latest

make run-udocker

same error.
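
Both docker pull outputs in this thread report the same digest, which suggests that removing the local image did not change what Docker Hub serves under the latest tag. A small sketch for comparing the digests of two saved pull outputs follows; the function names are hypothetical and the regular expression assumes the "Digest: sha256:..." line format shown above.

```python
import re

# Matches the digest line printed by `docker pull`, as seen in this thread.
DIGEST_RE = re.compile(r"Digest:\s*(sha256:[0-9a-f]{64})")


def pulled_digest(pull_output):
    """Extract the sha256 digest from `docker pull` output, or None."""
    match = DIGEST_RE.search(pull_output)
    return match.group(1) if match else None


def image_changed(before_output, after_output):
    """True only if both outputs carry a digest and the digests differ."""
    before = pulled_digest(before_output)
    after = pulled_digest(after_output)
    return before is not None and after is not None and before != after
```

If `image_changed` returns False for the output before and after `docker rmi`, the registry is still serving the same image, so any fix would have to come from a new push or a local rebuild.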

ygrange commented 5 years ago

Are you pulling the Docker container from Docker Hub, or are you doing: docker build docker -t kernsuite/prefactor from the repo? The latter should build the most current version of KERN in an image (if my Monday-afternoon brain is still functional enough).

gijzelaerr commented 5 years ago

udocker and singularity only support remote repositories, they can't import directly from your local docker cache.

I've pushed an updated Dockerfile to the repo and I've updated the image on Docker Hub. If I run this on my computer, I get the same error (column AZEL1 does not exist) with the example dataset in the repo.

HannoSpreeuw commented 5 years ago

Thanks. But try a new LOFAR dataset and you'll get the

Shared library lofarstman not found in CASACORE_LDPATH or (DY)LD_LIBRARY_PATH

mr-c commented 5 years ago

@HannoSpreeuw Which dataset, specifically?

ygrange commented 5 years ago

@gijzelaerr: I just pulled the kernsuite/prefactor container in and if I look at the source, the line causing this issue has not been patched (for reference, this is the patch: https://github.com/lofar-astron/prefactor/commit/dd0ec767ca4b20206155bb5100455697c712877b ).

lofarstman may be a different error altogether. I think we may need to call in the cavalry (@tammojan) for this. I know there is some magic step here but I can't really remember which.

HannoSpreeuw commented 5 years ago

@mr-c answered through dm.

gijzelaerr commented 5 years ago

> @gijzelaerr: I just pulled the kernsuite/prefactor container in and if I look at the source, the line causing this issue has not been patched (for reference, this is the patch: lofar-astron/prefactor@dd0ec76 ).
>
> lofarstman may be a different error altogether. I think we may need to call in the cavalry (@tammojan) for this. I know there is some magic step here but I can't really remember which.

OK, this is getting to be such a mess. Looks like this is an unreleased commit on the master branch? KERN-dev contains prefactor 2.0.3. You recently requested this fix for the 2.0.2 package in KERN-3; I don't automatically upload that to KERN-dev.

Probably best to just make a prefactor release with this patch and not let people use unreleased software.

ygrange commented 5 years ago

The best I can do is fork it off and make my own release of 2.0.3 cherry-picking this fix.

Not sure if anybody in the prefactor team has any time to fix this.

gijzelaerr commented 5 years ago

It is very easy to make a 2.0.3.1, which would be 2.0.3 with this fix applied. We can have a look later today; I'll be at ASTRON.

On Mon, 15 Oct 2018, 20:55 Yan Grange, notifications@github.com wrote:

The best I can do is fork it off and make my own release of 2.0.3 cherry-picking this fix.

Not sure if anybody in the prefactor team has any time to fix this.


tammojan commented 5 years ago

https://github.com/lofar-astron/prefactor/releases/tag/V2.0.3.1

Please have a look, I marked it as prerelease for now.

gijzelaerr commented 5 years ago

https://github.com/kernsuite/packaging/issues/155

gijzelaerr commented 5 years ago

This issue has been fixed in KERN-3, KERN-4 and KERN-dev.

HannoSpreeuw commented 5 years ago

This error persists for apparently any dataset other than prefactor-cwl's test datasets. A possible fix related to dysco should most likely be implemented from within the container, which is beyond the reach of the prefactor team.